Parallelize forest parsing for faster initial load

## Problem

Initial forest parsing processes included files sequentially in a single-threaded loop:

**Affected file:**
- `crates/lsp/src/forest.rs:78-202`

**Current flow:**
```rust
while !done {
    for file in to_process.iter() {
        // Parse file (sequential)
        let tree = parser.parse(&text, None).unwrap();
        // Extract data
        let beancount_data = BeancountData::new(&tree, &content);
        // Process includes, add to queue
    }
}
```

**Impact for a project with 50 included files (assuming ~10ms to parse each file):**
- Current: 50 files × 10ms = **500ms initial load**
- Wasted: on an 8-core machine, 7 cores sit idle while one parses sequentially

## Solution

Use rayon to parse independent files in parallel:

```rust
use rayon::prelude::*;

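// Parse all files in the current batch in parallel.
// Sketch only: assumes `read_file_cached` is changed to accept a shared,
// thread-safe cache, since a `&mut` borrow cannot be captured by a
// parallel closure. The closure returns a `Result`, so its error type
// needs to be spelled out in real code.
let results: Vec<_> = to_process
    .par_iter()
    .map(|file| {
        // One parser per task: `Parser::parse` takes `&mut self`
        let mut parser = Parser::new();
        parser.set_language(&tree_sitter_beancount::language()).unwrap();

        let text = read_file_cached(file, &file_cache)?;
        let tree = parser.parse(&text, None).unwrap();
        let content = Rope::from_str(&text);
        let beancount_data = BeancountData::new(&tree, &content);

        // Extract include patterns
        let includes = extract_includes(&tree, &text, file);

        Ok((file.clone(), tree, beancount_data, includes))
    })
    .collect();

// Back on the calling thread: update state, collect new includes
for result in results {
    match result {
        Ok((file, tree, beancount_data, includes)) => {
            // Update forest state
            // Add discovered includes to next batch
        }
        Err(err) => {
            // Report the failure without aborting the remaining files
        }
    }
}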
```
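The batch loop around the parallel step could look roughly like the following. This is a std-only sketch, not the code in `forest.rs`: it uses `std::thread::scope` in place of rayon, an in-memory map of path to contents in place of disk reads, and a hypothetical `parse_includes` helper standing in for tree-sitter parsing and `extract_includes`.

```rust
use std::collections::{HashMap, HashSet};
use std::thread;

/// Hypothetical stand-in for parsing: returns the paths named by
/// `include "..."` lines in `text`.
fn parse_includes(text: &str) -> Vec<String> {
    text.lines()
        .filter_map(|line| line.trim().strip_prefix("include "))
        .map(|rest| rest.trim().trim_matches('"').to_string())
        .collect()
}

/// Walks the include graph level by level. Files within a batch are
/// processed in parallel; batches run one after another because a file's
/// includes are only known once it has been parsed.
fn load_forest(root: &str, files: &HashMap<String, String>) -> Vec<String> {
    let mut visited: HashSet<String> = HashSet::new();
    visited.insert(root.to_string());
    let mut batch = vec![root.to_string()];
    let mut loaded = Vec::new();

    while !batch.is_empty() {
        // One scoped thread per file keeps the sketch short; a real
        // implementation would use a bounded pool.
        let results: Vec<(String, Vec<String>)> = thread::scope(|s| {
            let handles: Vec<_> = batch
                .iter()
                .map(|path| {
                    s.spawn(move || {
                        // Missing files are treated as empty here.
                        let text = files.get(path).map(String::as_str).unwrap_or("");
                        (path.clone(), parse_includes(text))
                    })
                })
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        });

        // Merge on the calling thread: record results, queue unseen includes.
        let mut next = Vec::new();
        for (path, includes) in results {
            loaded.push(path);
            for inc in includes {
                // `insert` returns false for already-seen paths, which
                // also guards against include cycles.
                if visited.insert(inc.clone()) {
                    next.push(inc);
                }
            }
        }
        batch = next;
    }
    loaded
}
```

Keeping the `visited` set and the next-batch queue on the calling thread means the workers share no mutable state, so the only synchronization point is the join at the end of each batch.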

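**Considerations:**
- Each worker needs its own `Parser`: `Parser::parse` takes `&mut self`, so a single parser cannot be shared across threads without locking. rayon's `map_init` can reuse one parser across several files instead of creating one per file
- Results collected via `.collect()` need no extra synchronization; a mutex or channel is only needed if workers push updates to shared state while still running
- Progress reporting needs a thread-safe counter (e.g. an atomic)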
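A minimal sketch of the first and last points, again using std scoped threads rather than rayon. `FakeParser` is a hypothetical stand-in for `tree_sitter::Parser` (its `parse` also takes `&mut self`), and `parse_batch` is an illustrative name, not an existing function.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for a parser whose `parse` needs `&mut self`.
struct FakeParser {
    last_line_count: usize,
}

impl FakeParser {
    fn parse(&mut self, text: &str) -> usize {
        self.last_line_count = text.lines().count();
        self.last_line_count
    }
}

/// Splits `texts` into one chunk per worker. Each worker owns its parser
/// and bumps the shared atomic counter after every file.
fn parse_batch(texts: &[&str], workers: usize, progress: &AtomicUsize) -> Vec<usize> {
    let workers = workers.max(1);
    // Ceiling division; `chunks` panics on a chunk length of 0.
    let chunk_len = ((texts.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        let handles: Vec<_> = texts
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    // Created inside the worker, never shared.
                    let mut parser = FakeParser { last_line_count: 0 };
                    chunk
                        .iter()
                        .map(|text| {
                            let lines = parser.parse(text);
                            // Relaxed is enough for a pure counter.
                            progress.fetch_add(1, Ordering::Relaxed);
                            lines
                        })
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order preserves the input order.
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}
```

A progress reporter can read the counter with `progress.load(Ordering::Relaxed)` at any time; the value is exact once all workers have been joined.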

## Expected Impact

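- **Up to ~8x faster initial load** on an 8-core system (500ms → ~62ms in the ideal case); the real speedup will be lower because each include level must finish before the next starts, and file I/O and merging results are not free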
- Better CPU utilization
- Scales with available cores
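As a rough sanity check on these numbers, the ideal time can be estimated per include level. The sketch below assumes every file costs the same and ignores I/O and merge overhead; the function names and the level breakdown are illustrative, not measurements.

```rust
/// Sequential estimate: every file is parsed one after another.
fn sequential_ms(files: u64, per_file_ms: u64) -> u64 {
    files * per_file_ms
}

/// Ideal parallel estimate. `levels[i]` is the number of files discovered
/// at include depth `i`; each level needs ceil(files / cores) rounds and
/// must finish before the next level starts.
fn ideal_parallel_ms(levels: &[u64], per_file_ms: u64, cores: u64) -> u64 {
    let cores = cores.max(1);
    levels
        .iter()
        .map(|&files| (files + cores - 1) / cores * per_file_ms)
        .sum()
}
```

With these assumptions, 50 files in a single batch on 8 cores take 7 rounds of 10ms, so 70ms rather than 62ms, because 50 does not divide evenly by 8. A root file that includes the other 49 adds one more sequential step: 10ms + 70ms = 80ms, about a 6x speedup over 500ms.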

## Acceptance Criteria

- [ ] Forest parsing uses rayon for parallel processing
- [ ] Each worker thread has its own Parser
- [ ] Progress reporting works correctly with parallel execution
- [ ] File cache is thread-safe (or per-thread)
- [ ] Existing tests pass
- [ ] No race conditions or data races
- [ ] Benchmark showing speedup with multiple files

## Effort Estimate

~4 hours

## Priority

**Medium** - High impact but only affects initial load, not ongoing usage

## Notes

- Consider using rayon's `ThreadPoolBuilder` (`num_threads`) to limit max threads
- Need to handle errors gracefully from parallel workers
- Progress reporting might need adjustment for concurrent updates
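One way to handle worker errors gracefully is to let every worker return a `Result`, run them all to completion, and split successes from failures afterwards. The sketch below uses std scoped threads instead of rayon; `LoadError`, `load_one`, and `load_all` are hypothetical names for illustration.

```rust
use std::thread;

#[derive(Debug, PartialEq)]
struct LoadError {
    path: String,
    reason: String,
}

/// Hypothetical per-file worker: reports a missing file as an error
/// instead of panicking. Returns the line count as a stand-in result.
fn load_one(path: &str, text: Option<&str>) -> Result<usize, LoadError> {
    let text = text.ok_or_else(|| LoadError {
        path: path.to_string(),
        reason: "file not found".to_string(),
    })?;
    Ok(text.lines().count())
}

/// Runs every worker to completion, then separates loaded files from
/// failures so one bad file does not abort the whole load.
fn load_all(inputs: &[(&str, Option<&str>)]) -> (Vec<(String, usize)>, Vec<LoadError>) {
    let results: Vec<Result<(String, usize), LoadError>> = thread::scope(|s| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|&(path, text)| (path, s.spawn(move || load_one(path, text))))
            .collect();
        handles
            .into_iter()
            .map(|(path, handle)| match handle.join() {
                Ok(Ok(lines)) => Ok((path.to_string(), lines)),
                Ok(Err(err)) => Err(err),
                // A panicking worker is reported like any other failure.
                Err(_) => Err(LoadError {
                    path: path.to_string(),
                    reason: "worker panicked".to_string(),
                }),
            })
            .collect()
    });

    let mut loaded = Vec::new();
    let mut errors = Vec::new();
    for result in results {
        match result {
            Ok(entry) => loaded.push(entry),
            Err(err) => errors.push(err),
        }
    }
    (loaded, errors)
}
```

The caller can then publish diagnostics for `errors` while still serving the files that loaded successfully.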
