feat: lesson about using a framework #1303
I got your point for avoiding type hints. However, in the case of the handler:
    @crawler.router.default_handler
    async def handle_listing(context):
        ...
It leaves the reader without any possibility for code completions or static analysis when working with the context object.
In my opinion, type hints should be included here. We have been using them across all docs & examples.
Just a suggestion for you to reconsider, not a request.
Other than that, good job 🙂, and the code seems to be working.
Thanks for the review! I see your point and I will indeed reconsider adding the type hint, at least for the context. It would be an easier decision if the type name wasn't 28 characters long, but you're right about the benefits for people with editors like VS Code, where we could assume some level of automatic code completion.
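For readers following the exchange above, here is a minimal sketch of what the annotated handler could look like. The type name `BeautifulSoupCrawlingContext` and its import path are assumptions (the name is not spelled out in this thread, and the import location has differed between Crawlee releases). The `seen_urls` list and the stand-in context are hypothetical and exist only so the sketch runs without Crawlee installed. In the lesson, the handler would be registered with `@crawler.router.default_handler`.

```python
from __future__ import annotations

import asyncio
from types import SimpleNamespace
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by type checkers and editors, never at runtime.
    # Assumed import path; check the Crawlee version you have installed.
    from crawlee.crawlers import BeautifulSoupCrawlingContext

# Hypothetical place to collect results, used here only for illustration.
seen_urls: list[str] = []


async def handle_listing(context: BeautifulSoupCrawlingContext) -> None:
    # Thanks to the annotation, an editor can offer completions for
    # attributes of the context object, such as context.request.
    seen_urls.append(context.request.url)


# Stand-in context so the sketch runs without Crawlee; a real crawler
# would construct the context and pass it to the handler itself.
fake_context = SimpleNamespace(request=SimpleNamespace(url="https://example.com/listing"))
asyncio.run(handle_listing(fake_context))
print(seen_urls)
```

Because of `from __future__ import annotations`, the annotation is never evaluated at runtime, so the long type name costs nothing beyond the extra characters on the signature line.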
        
          
                sources/academy/webscraping/scraping_basics_python/12_framework.md
              
                Outdated
          
            Show resolved
            Hide resolved
        
      See #1303 (comment) Co-authored-by: Ondra Urban <[email protected]>
@vdusek I added reconsidering the type hints to #1319, thanks! @TC-MO I think we could merge this now, but I'd appreciate it if you could take a look at what I did, at least in f0c6041. I randomly added words to Vale's spelling dictionary, but even then, I had to turn it off for one code block. According to my testing, it seems that Vale isn't able to identify and skip the block as a code block if it has a title. I also asked about the dictionary here: https://github.com/apify/apify-docs/pull/1345/files#r1922397733
I'm not sure about the code block titles and the Vale issues. I've never encountered this before, so let me investigate. As a quick fix, I guess turning Vale off before the code block and back on after it will do. If we find a proper solution, it will be an easy change.
Thanks! I'm surprised, because I think this isn't the first time it's used in the docs, but it's also possible that Vale works incrementally, so it errors only when I'm adding it, and ignores whatever is already there. Not sure.
The way Vale is set up, it doesn't check the whole documentation each time it runs; it only checks the changed files. That is why some other files may have a title in a code block without it being caught: they weren't changed in this PR. Also, this might be the result of Vale Spelling being added.
That's what I think as well.
Only for the framework lesson, previously discussed here #1303 (review)
This PR introduces a new lesson to the Python course for beginners in scraping. The lesson is about working with a framework. Decisions I made:
Crawlee feedback
Regarding Crawlee, I didn't have much trouble writing this lesson, apart from the part where I wanted to provide hints on how to do this:
I couldn't find a good example in the docs, and I was afraid that even if I provided pointers to all the individual pieces, the student wouldn't be able to figure it out.
Also, I wanted to link to the docs when pointing out the fact that enqueue_links() has a limit argument, but I couldn't find enqueue_links() in the docs. I found this, which is weird. It's not clear what object is documented, or what it is. It feels like some internals rather than regular docs of a method. I probably know how come it's this way, but I don't think it's useful this way, and I decided I don't want to send people from the course to that page.
One more thing: I do think that Crawlee should log some "progress" information about requests made or - especially - items scraped. It's so weird to run the program and then just look at it as if it hung, waiting to see whether something happens or not. E.g. Scrapy logs how many items per minute I scraped, which I personally find super useful.
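As a rough illustration of the limit argument of enqueue_links() mentioned above, a handler could cap the number of enqueued links as sketched below. Only the argument name comes from this PR description; the value 10 is arbitrary, and `FakeContext` is a hypothetical stand-in that records the call so the sketch runs without Crawlee installed.

```python
import asyncio


async def handle_listing(context) -> None:
    # Ask the framework to enqueue links found on the page, capped by `limit`.
    # The value 10 is arbitrary and chosen only for illustration.
    await context.enqueue_links(limit=10)


class FakeContext:
    """Hypothetical stand-in for the crawling context; records calls only."""

    def __init__(self) -> None:
        self.calls: list[dict] = []

    async def enqueue_links(self, **kwargs) -> None:
        # A real context would add requests to the queue here.
        self.calls.append(kwargs)


fake_context = FakeContext()
asyncio.run(handle_listing(fake_context))
print(fake_context.calls)
```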