Hypatia Catalog Upcoming Updates - 2024

We are proud to announce the beta release of our data processing pipeline and API services. The updated site and services are available at new.hypatiacatalog.com, which will remain available for a few months for testing and debugging before it becomes the main site. While the user-facing website may look mostly the same, we have included a number of improvements and bug fixes enabled by our new API. Note that the two versions report different numbers of stars and catalogs: the star count differs because of the updated handling of binaries, and the catalog count differs because of the updated counting of multiple papers by the same first author in a year. During this transition period, both the new and old APIs are available to give our users flexibility to test and update their existing pipelines. Please reach out with any issues or bugs you encounter; we want to handle them quickly and make the best possible final product for the permanent release.

Changes to the API

API users will notice improved performance, and authentication is no longer required for API access. We also updated the data objects that describe elemental abundance catalogs and solar normalizations, removing confusing or unused fields. We understand that these changes may cause some pipelines to break, and for that we apologize. While backwards compatibility was our goal, we sometimes gave priority to data quality. Read more about the API updates at new.hypatiacatalog.com/api.
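
Because some fields were removed from the catalog and solar normalization data objects, one way to check an existing pipeline during the transition is to compare the fields it expects against a record returned by the new API. The following is a minimal sketch of that check, not part of the Hypatia API: the function name and every field name in it are hypothetical placeholders, so substitute the fields your own pipeline reads.

```python
def missing_fields(record, expected_fields):
    """Return the expected field names that are absent from a record.

    `record` is a dict decoded from an API response, and `expected_fields`
    is the set of keys an existing pipeline reads from it.
    """
    return sorted(name for name in expected_fields if name not in record)


# Hypothetical example: these field names are placeholders, not the
# actual Hypatia schema.
expected_by_pipeline = {"author", "year", "legacy_id"}
new_api_record = {"author": "Example et al.", "year": 2020}

print(missing_fields(new_api_record, expected_by_pipeline))  # ['legacy_id']
```

An empty list means the record still carries every field the pipeline depends on; any names that are returned point to the places where the pipeline needs updating.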

Public Repository

We are happy to make public the data processing pipeline, API, and multi-service orchestration tools that our team built to enable this update; they are available at github.com/HypatiaOrg/HySite. Note that this repository has been designed for our team’s internal use. As such, it does not contain the web2py frontend, but it will include any new web pages we build in the future.

We are considering developing additional documentation and modes to allow anyone to run their own local instance of the Hypatia database. Currently, all calculations are done in a MongoDB database, with Python as the bridge between the database and the website’s API protocols. The completion time for the most complex calculations in the website’s API is still dominated by the latency of the online connection. A local copy of the database, free of that latency, could offer a significant performance improvement when training machine learning models that require complex filtering of the dataset (such as the re-calculation of mean/median values when considering subsets of input catalogs). Development of this idea will depend on community interest and the constraints of time and funding.
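
To illustrate the kind of re-calculation mentioned above, here is a minimal sketch of recomputing the mean and median of one abundance value over a chosen subset of input catalogs, using data already held in local memory. It is an illustration only and not the Hypatia implementation: the function name, the catalog names, and the abundance values are all hypothetical.

```python
from statistics import mean, median


def recompute_stats(abundances, selected_catalogs):
    """Recompute mean/median using only the selected input catalogs.

    `abundances` maps a catalog name to the value that catalog reports
    for one element in one star. Returns None when no selected catalog
    reports a value.
    """
    values = [
        value
        for catalog, value in abundances.items()
        if catalog in selected_catalogs
    ]
    if not values:
        return None
    return {"mean": mean(values), "median": median(values), "n": len(values)}


# Hypothetical values for a single star and element.
reported = {"catalog_a": 0.25, "catalog_b": 0.5, "catalog_c": 1.0}

print(recompute_stats(reported, {"catalog_a", "catalog_b"}))
print(recompute_stats(reported, {"catalog_a", "catalog_b", "catalog_c"}))
```

A model training loop may repeat this kind of filtering for many stars, elements, and catalog subsets, which is where avoiding a network round trip per query would matter most.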