Hypatia Catalog Upcoming Updates - 2024
We are proud to announce the beta release of our data processing pipeline and API services.
The updated site and services are available at
new.hypatiacatalog.com, which will
remain in place for a few months of testing and debugging before transitioning to become the main site.
While the user-facing website may look mostly the same, we have included
a number of improvements and bug fixes enabled by our new API. We also note that
the differences in the number of stars and catalogs between the versions are due to the updated
handling of binaries and to the new counting of multiple papers by the same first author in a year, respectively.
This transition period, during which both the new and old APIs are available,
is intended to give our users the flexibility to test and update their existing pipelines.
Please reach out with any issues or bugs you encounter;
we want to handle bugs quickly and make
the best possible final product for the permanent release.
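As a hedged illustration of that kind of testing, the sketch below queries the same endpoint on both hosts and flags any differences. The endpoint path and parameters are placeholders on our part, so consult new.hypatiacatalog.com/api for the actual routes.

```python
import requests

OLD_HOST = "https://hypatiacatalog.com"
NEW_HOST = "https://new.hypatiacatalog.com"
# Hypothetical endpoint and parameters, for illustration only; see
# new.hypatiacatalog.com/api for the real routes.
ENDPOINT = "/hypatia/api/v2/composition"
PARAMS = {"name": "HIP 98355", "element": "fe"}

# The old API may still expect your existing credentials; the new API
# requires no authentication.
old = requests.get(OLD_HOST + ENDPOINT, params=PARAMS, timeout=30)
new = requests.get(NEW_HOST + ENDPOINT, params=PARAMS, timeout=30)

if old.ok and new.ok and old.json() != new.json():
    print("Payloads differ; review the updated data objects before switching.")
```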
Changes to the API
API users will notice performance improvements,
and authentication is no longer required for API access.
We have also updated the data objects
that describe elemental abundance catalogs and solar normalizations,
removing confusing or unused fields.
We understand that these changes may break some pipelines, and for that
we apologize. While backwards compatibility was our goal,
data quality sometimes took precedence.
Read more about the API updates at
new.hypatiacatalog.com/api.
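For example, here is a minimal sketch of an unauthenticated request for the solar normalization data objects; the endpoint name and response shape are assumptions on our part, and the documentation above lists the real routes and fields.

```python
import requests

# No API key or authentication header is needed on the new API.
# The endpoint name below is assumed for illustration; consult
# new.hypatiacatalog.com/api for the actual route and schema.
resp = requests.get("https://new.hypatiacatalog.com/hypatia/api/v2/solarnorm",
                    timeout=30)
resp.raise_for_status()

# Inspect the trimmed-down data objects returned by the new API.
for norm in resp.json():
    print(norm)
```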
Public Repository
We are happy to make public the data processing pipeline, API,
and multi-service orchestration tools that our team built
to enable this update at
github.com/HypatiaOrg/HySite.
We note that this repository was designed for our team's internal use;
as such, it does not contain the web2py frontend,
though it will include any new web pages we build in the future.
We are considering developing additional documentation
and modes that would allow anyone to run their own local instance of the Hypatia database.
Currently, all calculations are performed in a
MongoDB database, with Python serving as the bridge
to the website's API protocols.
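In broad strokes, that bridge looks like the sketch below; the database, collection, and field names are illustrative guesses, not the actual schema in github.com/HypatiaOrg/HySite.

```python
from pymongo import MongoClient

# Illustrative names only: "hypatia", "stars", and the fields shown
# are assumptions, not the repository's real schema.
client = MongoClient("mongodb://localhost:27017")
stars = client["hypatia"]["stars"]

# Python mediates between MongoDB queries like this one and the
# JSON responses served through the website's API.
doc = stars.find_one({"name": "HIP 98355"})
if doc is not None:
    print(doc.get("abundances", {}))
```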
The completion time for the most complex calculations in the website's API
is still dominated by the latency of the network connection.
Providing a local copy of the database, free of that latency,
could offer a significant performance gain for training machine learning models
that require complex filtering of the dataset
(such as re-calculating mean/median values when considering subsets of input catalogs, as sketched below).
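To make that filtering concrete, here is a hedged sketch of re-computing a per-star median [Fe/H] from only a chosen subset of catalogs against a local copy; the catalog identifiers and field names are hypothetical.

```python
import statistics
from pymongo import MongoClient

# Assumed schema for illustration: each star document keys its
# per-catalog [Fe/H] measurements by catalog identifier.
client = MongoClient("mongodb://localhost:27017")
stars = client["hypatia"]["stars"]

keep = {"valenti05", "adibekyan12"}  # hypothetical catalog identifiers

for star in stars.find({}, {"name": 1, "fe_h": 1}):
    values = [v for cat, v in star.get("fe_h", {}).items() if cat in keep]
    if values:
        # Median over only the selected catalogs, recomputed locally
        # with no network latency.
        print(star["name"], statistics.median(values))
```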
Development of this idea will depend on community interest and the constraints of time and funding.