
Introducing EuclidesDB – A machine learning feature database

Last week I released the first public version of EuclidesDB. EuclidesDB is a multi-model machine learning feature database that is tightly coupled with PyTorch and provides a backend for adding and querying data in the model feature space.

For more information, see the GitHub repository or the documentation.

Some features of EuclidesDB are listed below:

  • Written in C++ for performance;
  • Uses protobuf for data serialization;
  • Uses gRPC for communication;
  • LevelDB integration for database serialization;
  • Many indexing methods implemented (Annoy, Faiss, etc.);
  • Tight PyTorch integration through libtorch;
  • Easy integration for new custom fine-tuned models;
  • Easy client language binding generation;
  • Free and open-source with a permissive license.
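
To make "querying data in the model feature space" more concrete, here is a minimal, purely illustrative Python sketch. It is not the EuclidesDB API: the `ToyFeatureStore` class, its `add` and `query` methods, and the hand-written vectors are all hypothetical stand-ins. In a real deployment the vectors would come from a PyTorch model, and the brute-force scan would be replaced by an index such as Annoy or Faiss.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyFeatureStore:
    """Hypothetical in-memory stand-in for a feature database (illustration only)."""

    def __init__(self):
        self._items = {}  # item id -> feature vector

    def add(self, item_id, features):
        # A real system would compute `features` by running a model on the item.
        self._items[item_id] = list(features)

    def query(self, features, top_k=3):
        # Brute-force scan: score every stored vector, keep the best top_k.
        scored = [
            (cosine_similarity(features, vec), item_id)
            for item_id, vec in self._items.items()
        ]
        scored.sort(reverse=True)
        return [(item_id, score) for score, item_id in scored[:top_k]]


store = ToyFeatureStore()
store.add("cat_1", [0.9, 0.1, 0.0])  # made-up feature vectors
store.add("cat_2", [0.8, 0.2, 0.1])
store.add("car_1", [0.0, 0.1, 0.9])

results = store.query([1.0, 0.0, 0.0], top_k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

The two "cat" items rank first because their vectors point in nearly the same direction as the query vector. Finding items whose features lie close to a query's features is the kind of similarity lookup a feature database is built to serve at scale.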

And here is a diagram of the overall architecture: