What data fields are available for analysis?

Data retrieved and processed by Cauldron is stored in Elasticsearch indices, each with specific fields you can use for querying, filtering, or building visualizations. So which indices are available, and which fields do they include?

These are the ones created from the data sources you include in your analysis:

  • git_enrich, from Git repository data related to commit activity: data schema
  • git_aoc_enrich, from Git repository data related to file changes: data schema
  • github_enrich, from GitHub issue and pull request activity: data schema
  • gitlab_enrich, from GitLab issue and merge request activity: data schema
  • meetup_enriched, from Meetup event activity: data schema
  • ocean, an aggregation of the previous indices (what Elasticsearch calls an alias)
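
If you want to check the fields yourself rather than rely on the linked schemas, one option is to read an index mapping and flatten it into a list of field names and types. The snippet below is only a sketch under some assumptions: it expects the response shape that Elasticsearch 7+ returns for `GET <index>/_mapping`, the sample mapping and its field names are made up for illustration (they are not the actual Cauldron schema), and the helper names `flatten_properties` and `fields_by_index` are my own.

```python
from typing import Dict, List, Tuple


def flatten_properties(properties: Dict, prefix: str = "") -> List[Tuple[str, str]]:
    """Turn a mapping 'properties' dict into sorted (dotted_field_name, type) pairs."""
    fields: List[Tuple[str, str]] = []
    for name, spec in sorted(properties.items()):
        path = prefix + name
        if "properties" in spec:
            # Object field: recurse into its sub-fields.
            fields.extend(flatten_properties(spec["properties"], path + "."))
        else:
            fields.append((path, spec.get("type", "object")))
    return fields


def fields_by_index(mapping_response: Dict) -> Dict[str, List[Tuple[str, str]]]:
    """Map each index name in a _mapping response to its flattened field list."""
    return {
        index: flatten_properties(body.get("mappings", {}).get("properties", {}))
        for index, body in mapping_response.items()
    }


# Illustrative sample only; a real response comes from GET <index>/_mapping.
sample_response = {
    "git_enrich": {
        "mappings": {
            "properties": {
                "author_name": {"type": "keyword"},
                "commit_date": {"type": "date"},
                "stats": {"properties": {"lines_changed": {"type": "long"}}},
            }
        }
    }
}

for index, fields in fields_by_index(sample_response).items():
    for field, field_type in fields:
        print(f"{index}: {field} ({field_type})")
```

Keep in mind that when you ask for the mapping of an alias such as ocean, Elasticsearch keys the response by the concrete index names behind the alias, which is why the helper loops over every key instead of looking up a single name.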

More indices will probably be added, and the existing ones might change, in upcoming releases of Cauldron. I’ll do my best to keep this list updated, but if you find an issue or have any questions, feel free to comment :wink: