Cleaning data from data-servers during import

During last week's call we identified two fields of data-server Enterprise data which we could clean or populate as the data arrives at a proxy server, based on data from the staging Locavora server:

  • Descriptions in Locavora include HTML, and it would be bad practice for us to trust HTML which has come from an external source. As such, we should remove or clean the HTML tags (currently we just render the HTML as a string).
  • If data is missing for latitude/longitude on an address, we can often populate it using an API. We think we should do this Python-side during the import. We have a JavaScript implementation and a TypeScript implementation in two other projects which we could use for inspiration.
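As a starting point for discussion, the two steps above could be sketched on the Python side roughly as follows. This is only a sketch under assumptions: the function names, the address field names (`street`, `city`, `postcode`, `latitude`, `longitude`) and the `geocode` callable are all hypothetical, and the geocoder is passed in as a parameter because no specific API has been chosen yet. The HTML step takes the "remove" option and uses only the standard library; if we decide to keep a safe subset of tags instead, a dedicated sanitising library would probably be a better fit.

```python
from html.parser import HTMLParser
from typing import Callable, Optional, Tuple


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops every tag."""

    _SKIPPED = {"script", "style"}      # contents discarded, not just the tags
    _BLOCK = {"p", "br", "div", "li"}   # replaced by a space so words do not run together

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes entities such as &amp;
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIPPED:
            self._skip_depth += 1
        elif tag in self._BLOCK:
            self._parts.append(" ")

    def handle_endtag(self, tag):
        if tag in self._SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self._BLOCK:
            self._parts.append(" ")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self):
        # Collapse runs of whitespace left behind by removed tags.
        return " ".join("".join(self._parts).split())


def strip_html(description: str) -> str:
    """Return the description as plain text with all HTML tags removed."""
    parser = _TextExtractor()
    parser.feed(description)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


Coordinates = Tuple[float, float]
# Hypothetical interface: takes an address string, returns (lat, lon) or None.
Geocoder = Callable[[str], Optional[Coordinates]]


def fill_missing_coordinates(address: dict, geocode: Geocoder) -> dict:
    """Return a copy of the address with latitude/longitude filled in if missing."""
    if address.get("latitude") is not None and address.get("longitude") is not None:
        return dict(address)  # already populated, no lookup needed
    # Field names here are assumptions about the address record.
    query = ", ".join(
        str(address[key]) for key in ("street", "city", "postcode") if address.get(key)
    )
    if not query:
        return dict(address)  # nothing to look up
    result = geocode(query)
    if result is None:
        return dict(address)  # lookup failed, leave the fields empty
    latitude, longitude = result
    return {**address, "latitude": latitude, "longitude": longitude}
```

Passing the geocoder in as a callable keeps the import code independent of whichever API we settle on, and lets tests use a stub instead of making network calls.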