A web application that aggregates sublet listings from Facebook Groups and Craigslist across multiple cities.
```bash
# Clone the repository
git clone this repo
cd casava-v2

# Install dependencies
yarn install

# Run the frontend in development
cd frontend
yarn dev

# Run the backend in development
cd backend
yarn dev

# Production build and start
yarn build
yarn start
```
- Data Collection Layer: A scraper service that fetches listings from Facebook Groups and Craigslist via a cron job that runs every 12 hours
- Processing Layer: LLM-powered NLP to parse and structure raw listing data
- Storage Layer: PostgreSQL database with Prisma ORM to store structured listing data
- API Layer: Express-based REST API to serve data to the frontend
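The layers above pass listings through a simple transformation: raw scraped posts go in, structured records come out. A minimal TypeScript sketch of that contract (field names here are illustrative assumptions, not the project's actual types):

```typescript
// Hypothetical shapes for data moving through the pipeline.
interface RawListing {
  source: "facebook" | "craigslist";
  url: string;      // link back to the original post
  postedAt: string; // ISO timestamp from the source page
  body: string;     // unparsed post text
}

interface StructuredListing {
  source: "facebook" | "craigslist";
  url: string;      // attribution to the original source
  city: string;
  price: number | null;
  summary: string;
}

// The processing layer maps RawListing -> StructuredListing via the
// LLM parser; this synchronous stub only illustrates the contract.
function structureStub(raw: RawListing): StructuredListing {
  return {
    source: raw.source,
    url: raw.url,
    city: "unknown",          // filled in by the real parser
    price: null,              // filled in by the real parser
    summary: raw.body.slice(0, 140),
  };
}
```

The storage and API layers then only ever see `StructuredListing`-shaped data, which keeps the database schema independent of source-specific quirks.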
- Node.js & TypeScript: For strongly typed server development
- Express: Lightweight web framework for RESTful API endpoints
- Prisma: Modern ORM for database access
- PostgreSQL: Relational database for persistent storage
- Winston: Structured logging
- dotenv: Environment variable management
- Next.js: React framework for production
- TailwindCSS 4.0: Utility-first CSS framework
- shadcn/ui: Component library built on Radix UI
- TypeScript: Type safety throughout the application
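Given Prisma and PostgreSQL in the stack above, a listing model in Prisma's schema language could look roughly like this (a sketch only; the field names are assumptions, not the project's actual schema):

```prisma
model Listing {
  id        Int       @id @default(autoincrement())
  source    String    // "facebook" | "craigslist"
  sourceUrl String    // attribution link back to the original post
  city      String
  price     Int?      // monthly rent, if the parser could extract it
  startDate DateTime?
  endDate   DateTime?
  summary   String
  createdAt DateTime  @default(now())
}
```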
- Node.js server runs scheduled scraping tasks every 12 hours to collect fresh listings without overloading source websites
- Scrapes 15-20 most recent posts per source (3-4 Facebook groups and Craigslist pages per city)
- Extracts listing details using LLM without storing original content
- Normalizes data across different sources
- Performs safety evaluation based on listing completeness and quality
- Maintains attribution to original sources
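Normalizing across sources mostly means coercing free-form text into typed fields. As one hedged example, a price normalizer for the varied formats seen in posts might look like this (the real service's logic may differ):

```typescript
// Illustrative normalizer: extracts a monthly price from strings like
// "$1,200/mo", "1200 per month", or "$950". Returns null when no
// plausible price is found.
function normalizePrice(text: string): number | null {
  const match = text.replace(/,/g, "").match(/\$?\s*(\d{2,5})/);
  return match ? parseInt(match[1], 10) : null;
}
```

The same pattern (regex or LLM extraction, then a typed fallback of `null`) applies to dates, city names, and other fields, so downstream code never has to branch on which source a listing came from.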
- RESTful endpoints for listing retrieval and filtering
- Cached responses to minimize database queries
- Proper error handling and validation
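The caching described above can be as simple as an in-memory map with a time-to-live, consulted before Prisma is queried. A minimal sketch (names are illustrative, not the project's actual helpers):

```typescript
// A tiny in-memory TTL cache of the kind an API layer could use to
// avoid repeated database queries for hot endpoints.
type Entry<T> = { value: T; expiresAt: number };

function makeTtlCache<T>(ttlMs: number) {
  const store = new Map<string, Entry<T>>();
  return {
    get(key: string): T | undefined {
      const e = store.get(key);
      if (!e || e.expiresAt < Date.now()) {
        store.delete(key); // drop stale entries lazily
        return undefined;
      }
      return e.value;
    },
    set(key: string, value: T): void {
      store.set(key, { value, expiresAt: Date.now() + ttlMs });
    },
  };
}
```

An Express handler would check `get(cacheKey)` first and only fall through to the database on a miss, writing the result back with `set`.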
Node backend:

```
backend/
├── src/             # Source code
│   ├── app.ts       # Express application setup
│   ├── config/      # Configuration files
│   ├── controllers/ # API controllers
│   ├── middleware/  # Express middleware
│   ├── routes/      # API routes
│   ├── services/    # Business logic
│   ├── types/       # TypeScript type definitions
│   └── utils/       # Utility functions
```
CASAVA follows established legal precedents (hiQ v. LinkedIn, Meta v. Bright Data) by only scraping public data without authentication, extracting non-copyrightable facts rather than creative content, maintaining attribution to original sources, and implementing an immediate compliance protocol for any cease-and-desist requests.
- hiQ v. LinkedIn (2022): Public data scraping without authentication may be legal under the CFAA, but using fake accounts to bypass restrictions remains illegal.
- Meta v. Bright Data (2024): Platform terms of service don't apply to users scraping public data while logged out.
- X Corp v. Bright Data (2024): Terms prohibiting scraping may be unenforceable if they attempt to create a "private copyright" over non-owned content.
- Craigslist v. 3taps (2013): Ignoring cease-and-desist notices and circumventing IP blocks violates the CFAA.
- Facebook v. Power Ventures: Copying entire pages containing copyrighted elements creates liability, even when targeting user content.
- Feist Publications v. Rural Telephone: Facts are not copyrightable; only creative selection and arrangement can be protected.