@@ -12,22 +12,29 @@ Traditional UI testing is painful:
 - **Pixel-perfect comparisons** fail on minor, acceptable variations
 - **Writing test assertions** requires deep technical knowledge
 - **Cross-browser testing** multiplies complexity
+- **Generic analysis lacks domain expertise** - accessibility, conversion optimization, mobile UX
 - **Accessibility checks** need specialized tools and expertise
 
 ## The Solution
 
-LayoutLens lets you test UIs the way humans see them - using natural language and visual understanding:
+LayoutLens lets you test UIs the way humans see them - using natural language and domain expert knowledge:
 
 ```python
-result = lens.analyze("https://example.com", "Is the navigation user-friendly?")
-# Returns: "Yes, the navigation is clean and intuitive with clear labels"
+# Basic analysis
+result = await lens.analyze("https://example.com", "Is the navigation user-friendly?")
+
+# Expert-powered analysis
+result = await lens.audit_accessibility("https://example.com", compliance_level="AA")
+# Returns: "WCAG AA compliant with 4.7:1 contrast ratio. Focus indicators visible..."
 ```
 
 Instead of writing complex selectors and assertions, just ask questions like:
 - "Is this page mobile-friendly?"
 - "Are all buttons accessible?"
 - "Does the layout look professional?"
 
+Get expert-level insights from built-in domain knowledge in **accessibility**, **conversion optimization**, **mobile UX**, and more.
+
 **✅ 95.2% accuracy** on real-world UI testing benchmarks
 
 ## Quick Start
@@ -46,7 +53,7 @@ from layoutlens import LayoutLens
 lens = LayoutLens()
 
 # Test any website or local HTML
-result = lens.analyze("https://your-site.com", "Is the header properly aligned?")
+result = await lens.analyze("https://your-site.com", "Is the header properly aligned?")
 print(f"Answer: {result.answer}")
 print(f"Confidence: {result.confidence:.1%}")
 ```
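The `:.1%` format spec used above turns a 0-1 confidence score into a percentage with one decimal place; a quick standalone illustration:

```python
# f-string percent formatting: multiplies by 100, keeps one decimal, appends %
confidence = 0.8734
print(f"Confidence: {confidence:.1%}")  # → Confidence: 87.3%
```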
@@ -59,43 +66,60 @@ That's it! No selectors, no complex setup, just natural language questions.
 Test single pages with custom questions:
 ```python
 # Test local HTML files
-result = lens.analyze("checkout.html", "Is the payment form user-friendly?")
+result = await lens.analyze("checkout.html", "Is the payment form user-friendly?")
+
+# Test with expert context
+from layoutlens.prompts import Instructions, UserContext
+
+instructions = Instructions(
+    expert_persona="conversion_expert",
+    user_context=UserContext(
+        business_goals=["reduce_cart_abandonment"],
+        target_audience="mobile_shoppers"
+    )
+)
 
-# Test with different viewports
-result = lens.analyze(
-    "homepage.html",
-    "How does this look on mobile?",
-    viewport="mobile_portrait"
+result = await lens.analyze(
+    "checkout.html",
+    "How can we optimize this checkout flow?",
+    instructions=instructions
 )
 ```
 
 ### 2. Compare Layouts
 Perfect for A/B testing and redesign validation:
 ```python
-result = lens.compare(
+result = await lens.compare(
     ["old-design.html", "new-design.html"],
     "Which design is more accessible?"
 )
 print(f"Winner: {result.answer}")
 ```
 
-### 3. Built-in Checks
-Common tests with one line of code:
+### 3. Expert-Powered Analysis
+Domain expert knowledge with one line of code:
 ```python
-# Accessibility compliance
-result = lens.check_accessibility("product-page.html")
+# Professional accessibility audit (WCAG expert)
+result = await lens.audit_accessibility("product-page.html", compliance_level="AA")
 
-# Mobile responsiveness
-result = lens.check_mobile_friendly("landing.html")
+# Conversion rate optimization (CRO expert)
+result = await lens.optimize_conversions("landing.html",
+    business_goals=["increase_signups"], industry="saas")
 
-# Conversion optimization
-result = lens.check_conversion_optimization("checkout.html")
+# Mobile UX analysis (Mobile expert)
+result = await lens.analyze_mobile_ux("app.html", performance_focus=True)
+
+# E-commerce audit (Retail expert)
+result = await lens.audit_ecommerce("checkout.html", page_type="checkout")
+
+# Legacy methods still work
+result = await lens.check_accessibility("product-page.html")  # Backward compatible
 ```
 
 ### 4. Batch Testing
 Test multiple pages efficiently:
 ```python
-results = lens.analyze_batch(
+results = await lens.analyze(
     sources=["home.html", "about.html", "contact.html"],
     queries=["Is it accessible?", "Is it mobile-friendly?"]
 )
@@ -105,36 +129,86 @@ results = lens.analyze_batch(
 ### 5. High-Performance Async (3-5x faster)
 ```python
 # Async for maximum throughput
-result = await lens.analyze_batch_async(
+result = await lens.analyze(
     sources=["page1.html", "page2.html", "page3.html"],
     queries=["Is it accessible?"],
     max_concurrent=5
 )
 ```
 
-## CLI Usage (v1.4.0 - Async-by-Default)
+### 6. Structured JSON Output
+All results provide clean, typed JSON for automation:
+```python
+result = await lens.analyze("page.html", "Is it accessible?")
+
+# Export to clean JSON
+json_data = result.to_json()  # Returns typed JSON string
+print(json_data)
+# {
+#   "source": "page.html",
+#   "query": "Is it accessible?",
+#   "answer": "Yes, the page follows accessibility standards...",
+#   "confidence": 0.85,
+#   "reasoning": "The page has proper heading structure...",
+#   "screenshot_path": "/path/to/screenshot.png",
+#   "viewport": "desktop",
+#   "timestamp": "2024-01-15 10:30:00",
+#   "execution_time": 2.3,
+#   "metadata": {}
+# }
+
+# Type-safe structured access
+from layoutlens.types import AnalysisResultJSON
+import json
+data: AnalysisResultJSON = json.loads(result.to_json())
+confidence = data["confidence"]  # Fully typed: float
+```
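Because `to_json()` returns plain JSON text, downstream tooling needs nothing beyond the standard library to consume it. A minimal sketch, using a hand-written payload that mirrors the schema above (field values are illustrative, not real output):

```python
import json

# Hand-written payload mirroring the documented result schema (illustrative values)
raw = """{
  "source": "page.html",
  "query": "Is it accessible?",
  "answer": "Yes, the page follows accessibility standards...",
  "confidence": 0.85,
  "reasoning": "The page has proper heading structure...",
  "viewport": "desktop",
  "execution_time": 2.3,
  "metadata": {}
}"""

data = json.loads(raw)

# Gate automation on the typed fields
assert isinstance(data["confidence"], float)
verdict = "PASS" if data["confidence"] > 0.8 and data["answer"].lower().startswith("yes") else "REVIEW"
print(verdict)  # → PASS
```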
+
+### 7. Domain Experts & Rich Context
+Choose from 6 built-in domain experts with specialized knowledge:
+```python
+# Available experts: accessibility_expert, conversion_expert, mobile_expert,
+# ecommerce_expert, healthcare_expert, finance_expert
+
+# Use any expert with custom analysis
+result = await lens.analyze_with_expert(
+    source="healthcare-portal.html",
+    query="How can we improve patient experience?",
+    expert_persona="healthcare_expert",
+    focus_areas=["patient_privacy", "health_literacy"],
+    user_context={
+        "target_audience": "elderly_patients",
+        "accessibility_needs": ["large_text", "simple_navigation"],
+        "industry": "healthcare"
+    }
+)
 
-```bash
-# Quick test with concurrent processing
-layoutlens test --page example.com --queries "Is this accessible?"
+# Expert comparison analysis
+result = await lens.compare_with_expert(
+    sources=["old-design.html", "new-design.html"],
+    query="Which design converts better?",
+    expert_persona="conversion_expert",
+    focus_areas=["cta_prominence", "trust_signals"]
+)
+```
 
-# Test with multiple viewports concurrently
-layoutlens test --page mysite.com --queries "Good mobile UX?" --viewports "mobile_portrait,desktop"
+## CLI Usage
 
-# Compare designs with async processing
-layoutlens compare before.html after.html
+```bash
+# Analyze a single page
+layoutlens https://example.com "Is this accessible?"
 
-# Batch process multiple sources efficiently
-layoutlens batch --sources "site1.com,site2.com" --queries "Is it accessible?"
+# Analyze local files
+layoutlens page.html "Is the design professional?"
 
-# Interactive mode with Rich terminal formatting
-layoutlens interactive
+# Compare two designs
+layoutlens page1.html page2.html --compare
 
-# Generate config template
-layoutlens generate config
+# Analyze with different viewport
+layoutlens site.com "Is it mobile-friendly?" --viewport mobile
 
-# Check system status and API keys
-layoutlens info
+# JSON output for automation
+layoutlens page.html "Is it accessible?" --output json
 ```
 
 ## CI/CD Integration
@@ -145,42 +219,116 @@ layoutlens info
   run: |
     pip install layoutlens
     playwright install chromium
-    layoutlens test --page ${{ env.PREVIEW_URL }} \
-      --queries "Is it accessible?,Is it mobile-friendly?"
+    layoutlens ${{ env.PREVIEW_URL }} "Is it accessible and mobile-friendly?"
 ```
 
 ### Python Testing
 ```python
 import pytest
 from layoutlens import LayoutLens
 
-def test_homepage_quality():
+@pytest.mark.asyncio
+async def test_homepage_quality():
     lens = LayoutLens()
-    result = lens.analyze("homepage.html", "Is this production-ready?")
+    result = await lens.analyze("homepage.html", "Is this production-ready?")
     assert result.confidence > 0.8
     assert "yes" in result.answer.lower()
 ```
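The pytest example above needs the `pytest-asyncio` plugin for the marker and a real API key for the call. The assertion logic itself can be exercised offline by stubbing the client; `FakeLens` and `FakeResult` below are hypothetical stand-ins for illustration, not part of LayoutLens:

```python
import asyncio

# Hypothetical stand-ins so the gating logic runs without an API key
class FakeResult:
    answer = "Yes, the homepage looks production-ready."
    confidence = 0.9

class FakeLens:
    async def analyze(self, source: str, query: str) -> FakeResult:
        return FakeResult()

async def homepage_is_ready(lens) -> bool:
    result = await lens.analyze("homepage.html", "Is this production-ready?")
    # Same acceptance criteria as the pytest example above
    return result.confidence > 0.8 and "yes" in result.answer.lower()

print(asyncio.run(homepage_is_ready(FakeLens())))  # → True
```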
 
+## Benchmark & Evaluation Workflow
+
+LayoutLens includes a comprehensive benchmarking system to validate AI performance:
+
+### 1. Generate Benchmark Results
+```bash
+# Run LayoutLens against test data
+python benchmarks/run_benchmark.py --api-key sk-your-key
+
+# With custom settings
+python benchmarks/run_benchmark.py \
+    --api-key sk-your-key \
+    --output benchmarks/my_results \
+    --no-batch \
+    --filename custom_results.json
+```
+
+### 2. Evaluate Performance
+```bash
+# Evaluate results against ground truth
+python benchmarks/evaluation/evaluator.py \
+    --answer-keys benchmarks/answer_keys \
+    --results benchmarks/layoutlens_output \
+    --output evaluation_report.json
+```
+
+### 3. Structured Benchmark Results
+The benchmark runner outputs clean JSON for analysis; an example result structure:
+```json
+{
+  "benchmark_info": {
+    "total_tests": 150,
+    "successful_tests": 143,
+    "failed_tests": 7,
+    "success_rate": 0.953,
+    "batch_processing_used": true,
+    "model_used": "gpt-4o-mini"
+  },
+  "results": [
+    {
+      "html_file": "good_contrast.html",
+      "query": "Is this page accessible?",
+      "answer": "Yes, the page has good color contrast...",
+      "confidence": 0.89,
+      "reasoning": "WCAG guidelines are followed...",
+      "success": true,
+      "error": null,
+      "metadata": {"category": "accessibility"}
+    }
+  ]
+}
+```
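The summary fields in `benchmark_info` are straightforward aggregates of the per-test `results` entries; a stdlib sketch of recomputing them (the sample entries are made up):

```python
# Made-up per-test entries shaped like the "results" array above
results = [
    {"html_file": "good_contrast.html", "success": True},
    {"html_file": "tiny_text.html", "success": True},
    {"html_file": "broken_nav.html", "success": False},
]

successful = sum(1 for r in results if r["success"])
benchmark_info = {
    "total_tests": len(results),
    "successful_tests": successful,
    "failed_tests": len(results) - successful,
    "success_rate": round(successful / len(results), 3),
}
print(benchmark_info["success_rate"])  # → 0.667
```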
+
+### 4. Custom Benchmarks
+Create your own test data and answer keys:
+```python
+# Use the async API for custom benchmark workflows
+from layoutlens import LayoutLens
+
+async def run_custom_benchmark():
+    lens = LayoutLens()
+
+    test_cases = [
+        {"source": "page1.html", "query": "Is it accessible?"},
+        {"source": "page2.html", "query": "Is it mobile-friendly?"}
+    ]
+
+    results = []
+    for case in test_cases:
+        result = await lens.analyze(case["source"], case["query"])
+        results.append({
+            "test": case,
+            "result": result.to_json(),  # Clean JSON output
+            "passed": result.confidence > 0.7
+        })
+
+    return results
+```
+
 ## Configuration
 
-LiteLLM unified provider support with configuration options:
+Simple configuration options:
 ```python
 # Via environment
 export OPENAI_API_KEY="sk-..."
 
-# Via code with LiteLLM unified providers
+# Via code
 lens = LayoutLens(
     api_key="sk-...",
     model="gpt-4o-mini",  # or "gpt-4o" for higher accuracy
-    provider="openai",  # "openai", "anthropic", "google", "gemini", "litellm"
     cache_enabled=True,  # Reduce API costs
     cache_type="memory",  # "memory" or "file"
 )
-
-# Provider examples using LiteLLM unified interface
-lens = LayoutLens(provider="anthropic", model="anthropic/claude-3-5-sonnet")
-lens = LayoutLens(provider="google", model="google/gemini-1.5-pro")
-lens = LayoutLens(provider="litellm", model="gpt-4o")  # Direct LiteLLM access
 ```
 
 ## Resources
@@ -193,11 +341,14 @@ lens = LayoutLens(provider="litellm", model="gpt-4o") # Direct LiteLLM access
 ## Why LayoutLens?
 
 - **Natural Language** - Write tests like you'd describe the UI to a colleague
+- **Domain Expert Knowledge** - Built-in expertise in accessibility, CRO, mobile UX, and more
+- **Rich Context Support** - Business goals, user personas, compliance standards, and technical constraints
 - **Zero Selectors** - No more fragile XPath or CSS selectors
 - **Visual Understanding** - AI sees what users see, not just code
 - **Async-by-Default** - Concurrent processing for optimal performance
-- **Multiple AI Providers** - Support for OpenAI, Anthropic, Google via LiteLLM
-- **Interactive Mode** - Real-time analysis with Rich terminal formatting
+- **Simple API** - One analyze method handles single pages, batches, and comparisons
+- **Structured JSON Output** - TypedDict schemas for full type safety in automation
+- **Comprehensive Benchmarking** - Built-in evaluation system with 95.2% accuracy
 - **Production Ready** - Used by teams for real-world applications
 
 ---