Module 2 – Run the data pipeline
Now that you have fully configured your environment, it’s time to process and upload the movie data.
In this module, you will download the dataset, clean and normalize the text, generate embeddings using the model you selected earlier, and upload both the vectors and full movie metadata to your deployed Chromia chain.
By the end of this module, you will have populated your on-chain database, making it ready to support semantic search.
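The cleaning and normalization step mentioned above typically means standardizing Unicode, casing, and whitespace before embedding. The exact rules live in the module's pipeline script; the sketch below is only an illustrative assumption of what such a step might look like (the function name `clean_text` is hypothetical, not part of the project).

```python
import re
import unicodedata

def clean_text(text: str) -> str:
    """Hypothetical normalization pass: NFKC Unicode normalization,
    lowercasing, and whitespace collapsing."""
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    # Collapse runs of whitespace (including newlines) into single spaces
    text = re.sub(r"\s+", " ", text).strip()
    return text

print(clean_text("  The  GODFATHER\n(1972) "))  # → "the godfather (1972)"
```

Consistent normalization matters here because the same cleaning must be applied to search queries later, or the query embeddings won't line up with the stored vectors.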
Activate vector_demo_env (if it's not already active)
If you're in the rell/ folder:
cd ../python
Or from the root of the project:
cd python
Then activate the environment:
source vector_demo_env/bin/activate
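If you want to confirm the environment is really active before running the pipeline, one quick check from Python itself is to compare the interpreter's prefix paths; inside a virtual environment they differ (this is a general Python property, not something specific to this project):

```python
import sys

# In an active virtual environment, sys.prefix points at the venv
# directory, while sys.base_prefix still points at the base interpreter.
in_venv = sys.prefix != sys.base_prefix
print("virtual environment active:", in_venv)
```

Alternatively, your shell prompt should show the `(vector_demo_env)` prefix after activation.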