Glimpse is a data engineering project designed to:
- Read in the latest news snippets from a source API
- Combine and process text and metadata into a single input
- Call an LLM API to generate a single-paragraph prompt
- Feed the prompt into an image generation API
- Use a social platform API to automatically generate content daily
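
As a rough illustration of the "combine and process" and "generate a prompt" steps, here is a minimal sketch. The `Snippet` fields, the function names, and the instruction wording are all hypothetical and are not taken from the project's code or from any API's schema; no API is called.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    """One news item. Field names are assumptions, not the source API's schema."""
    title: str
    description: str
    category: str


def combine_snippets(snippets: list[Snippet], limit: int = 5) -> str:
    """Merge text and metadata from several snippets into a single input string."""
    lines = []
    for item in snippets[:limit]:
        # Collapse stray whitespace so each snippet occupies exactly one line.
        text = " ".join(f"{item.title}. {item.description}".split())
        lines.append(f"[{item.category}] {text}")
    return "\n".join(lines)


def build_llm_request(combined: str) -> str:
    """Wrap the combined input in an instruction asking for one paragraph."""
    return (
        "Summarise the following news snippets as a single paragraph "
        "that can be used as an image generation prompt:\n" + combined
    )
```

The string returned by `build_llm_request` would then be sent to whichever LLM API the pipeline uses, and the paragraph that comes back would be passed on to the image generation step.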
Components that are not IaC:
- Updating of AWS credentials/IAM user
- SNS topic setup and subscriptions
- Pandas lambda layer
- OpenAI lambda layer
- Creation of environment variables for content create lambda
- Posting of content
- Currents API Documentation
- AWS Console Login
- DALL-E Dashboard (no longer in use)
- Leonardo.AI API Documentation
1. Create and branch off new issue
2. Install all local dependencies in `requirements.txt` as well as setting up `serverless` locally
3. Use `#%%` magic from the VS Code Jupyter extension to run isolated lambda functions
4. Replicate variables locally using sample files
5. To test, upload a sample `raw_feed.json` from local into glimpse-landing-dev through the AWS console. This should kick off the pipeline automatically
6. If everything runs correctly, an email should be sent to jtsw1990@gmail.com with the content feed
7. If not, review the logs and check each lambda's latest timestamp to identify error messages
8. Delete the `raw_feed.json` from glimpse-landing-dev and `feature.json` from glimpse-feature-store if applicable to keep things clean
9. Repeat steps 3-8 until tests run as expected
10. Run `ruff check . --fix` to highlight any linting issues
11. Run `sls deploy` to push latest adjustments to AWS (note the components not included in IaC above and apply accordingly)
12. Run git workflow to push to feature branch
13. Merge back into main
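
The upload and clean-up steps above are described as manual actions in the AWS console. As an optional alternative, they could be scripted along these lines. This is only a sketch: it assumes the objects sit at the bucket root under the keys `raw_feed.json` and `feature.json`, and the helper names `upload_sample` and `clean_up` are hypothetical. The client is passed in, so any object exposing `put_object` and `delete_object` (such as a boto3 S3 client) will work.

```python
from pathlib import Path

# Bucket names as mentioned in the steps above; object keys are assumptions.
LANDING_BUCKET = "glimpse-landing-dev"
FEATURE_BUCKET = "glimpse-feature-store"


def upload_sample(s3_client, sample_path: str, key: str = "raw_feed.json") -> None:
    """Upload a local sample feed to the landing bucket to kick off the pipeline."""
    body = Path(sample_path).read_bytes()
    s3_client.put_object(Bucket=LANDING_BUCKET, Key=key, Body=body)


def clean_up(s3_client) -> None:
    """Delete test artefacts from both buckets so the next run starts clean."""
    s3_client.delete_object(Bucket=LANDING_BUCKET, Key="raw_feed.json")
    s3_client.delete_object(Bucket=FEATURE_BUCKET, Key="feature.json")
```

With boto3 installed and credentials configured, usage would look like `s3 = boto3.client("s3")`, then `upload_sample(s3, "raw_feed.json")`, and `clean_up(s3)` once the run has been checked.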
- Become a wizard at building infrastructure
- Learn the right practices and tools to avoid running notebooks manually in data science projects
- Be able to weigh options for different solutions given a specific stack and situation
- Have fun learning and hopefully build something cool along the way
- Get used to the standard git development process (TBD) which will help with work
- Create a personal project template that can be reused
- Add an element of content creation to this
