
Commit a70f479

Merge branch 'main' into develop
2 parents 0a656db + 45decf7 commit a70f479

18 files changed

+16699
-5769
lines changed
Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
1+
---
2+
title: New Documentation Site for IntelOwl and friends.
3+
date: 2024-08-15
4+
cover: /images/gsoclogo.png
5+
author: Aryan Bhokare
6+
---
7+
8+
## Introduction
9+
10+
As a Full Stack Web Developer with a keen interest in security, I was immediately drawn to IntelOwl due to its real-world applicability and robust feature set. I began contributing to the project in January 2024, focusing primarily on frontend issues and the addition of analyzers, under the guidance of Matteo.
11+
12+
### Pre-GSOC Commits/Discussions.
13+
14+
```(js)
15+
#2092: [Analyzer] IP2Location
16+
#2166: Added table cell component and fixed text-wrapping issue.
17+
```
18+
19+
I was later introduced to an issue related to IntelOwl’s main [documentation site](https://github.com/intelowlproject/IntelOwl/issues/2043). I resonated with the approach discussed by Matteo, conducted thorough research, and developed a proposal that the mentors appreciated, leading to my selection for GSoC.
20+
21+
According to my initial proposal, my objectives were:
22+
23+
> Develop a new documentation site with custom themes for an enhanced UI experience.
24+
> Integrate Swagger UI for API specifications.
25+
> Centralize documentation for all repositories within the IntelOwl project.
26+
> Add docstrings for dynamic documentation and contribute guides for project contribution and usage.
27+
28+
# GSoC Deliverables and Tasks
29+
30+
I planned and successfully completed the following tasks during GSoC 2024, with the support of my mentors, Matteo Lodi and Daniel Rosetti. Below is an expansion on each task, the challenges I encountered, and the learning experiences gained.
31+
32+
Because this was a new repository, I was given permission to push to it directly, so instead of PRs, here is the list of [commits](https://github.com/intelowlproject/docs/commits/main/?author=aryan-bhokare) that shows my work.
33+
34+
### [IntelOwl Project’s Documentation Website](https://intelowlproject.github.io/docs/)
35+
36+
My first task was to design the UI of the documentation site using MkDocs. After discussing with the mentors, we settled on using the Material theme. Upon completing the basic site structure, I collaborated with the mentors to finalize a visually appealing custom theme.
37+
38+
Here is the [website](https://intelowlproject.github.io/docs/).
39+
40+
### Docstrings Integration.
41+
42+
Integrating docstrings dynamically into MkDocs using the mkdocstrings package in Python was a complex task.
43+
44+
The challenge arose primarily due to our need for a centralised documentation site. Finding the right approach was difficult, but after some research, we discovered the [mkdocs-monorepo-plugin](https://github.com/backstage/mkdocs-monorepo-plugin), which helped facilitate the integration.
45+
46+
After several iterations, I successfully integrated the plugin, resulting in a more comprehensive and informative documentation site.
47+
48+
### Submodules Integration
49+
50+
Our previous solution had many flaws, as it was not fully compatible with docstrings, and there were issues with CSS not being rendered. Initially, our approach involved having a separate documentation site for each repository and then integrating all the sites into our centralized site.
51+
52+
However, we later decided to move away from this approach and explore other options. During further research, we came across Git submodules, which fit perfectly with our requirements.
53+
54+
One significant challenge was dynamically fetching documentation and docstrings from the various IntelOwl repositories to avoid redundant updates. Implementing submodules came with its own set of challenges, such as how to keep the submodules consistent with the latest commits and how the code would be fetched. I was able to overcome them and implemented this [GitHub Action](https://github.com/intelowlproject/docs/blob/main/.github/workflows/deploy_and_update_submodules.yml), which handles it.
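
To make the "keep submodules consistent with the latest commits" step more concrete, here is a minimal Python sketch. It is not the linked workflow: it only assumes that the sync boils down to the standard `git submodule update --init --recursive --remote` command, and the function names and the injectable `run` parameter are hypothetical.

```python
import subprocess
from typing import Callable, List


def submodule_update_command(remote: bool = True) -> List[str]:
    """Build the git command that initializes and syncs all submodules."""
    cmd = ["git", "submodule", "update", "--init", "--recursive"]
    if remote:
        # --remote moves each submodule to the tip of its tracked branch
        # instead of the commit pinned in the parent repository.
        cmd.append("--remote")
    return cmd


def update_submodules(
    repo_dir: str, run: Callable[..., object] = subprocess.run
) -> List[str]:
    """Run the sync inside repo_dir and return the command that was used.

    `run` is injectable (an assumption for this sketch) so the function
    can be exercised without a real git checkout.
    """
    cmd = submodule_update_command()
    run(cmd, cwd=repo_dir, check=True)
    return cmd
```

Calling `update_submodules(".")` from the root of a checkout with submodules would run the command for real; a CI job would typically commit the updated submodule pointers afterwards, before building the site.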
55+
56+
### [Swagger UI Integration](https://intelowlproject.github.io/docs/IntelOwl/api_docs/)
57+
58+
The integration of Swagger UI for API specs was straightforward, especially after resolving the dynamic update issue with submodules. I also added a dark mode feature to ensure consistency with the overall theme of the documentation site.
59+
60+
Link to [SwaggerUI api-docs](https://intelowlproject.github.io/docs/IntelOwl/api_docs/)
61+
62+
## Deployment Using GitHub Pages
63+
64+
Deploying the site using GitHub Pages was relatively easy, thanks to a pre-existing [GitHub Action](https://github.com/marketplace/actions/deploy-mkdocs) for MkDocs deployment.
65+
66+
However, ensuring that submodules were updated before deployment was crucial. I explored several approaches to trigger the main repo to fetch updates from child repos upon commits, but this proved complex.
67+
68+
This [GitHub Action](https://github.com/intelowlproject/docs/blob/main/.github/workflows/deploy_and_update_submodules.yml) handles all the required updates.
69+
70+
## Addition of Docstrings
71+
72+
In line with my proposal, I dedicated time to adding comprehensive docstrings across the IntelOwl codebase to leverage the mkdocstrings integration fully. Given the time-intensive nature of writing docstrings, I worked on this in parallel with other tasks.
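
As an illustration of the kind of docstring that mkdocstrings can render, here is a small sketch. The function is hypothetical and not taken from the IntelOwl codebase, and the Google-style section layout is an assumption about the convention used.

```python
def normalize_observable(value: str, lowercase: bool = True) -> str:
    """Normalize an observable string before analysis.

    Hypothetical helper used only to illustrate docstring layout;
    it is not part of the IntelOwl codebase.

    Args:
        value: Raw observable, for example a domain or an IP address.
        lowercase: Whether to lowercase the result.

    Returns:
        The observable with surrounding whitespace removed.

    Raises:
        ValueError: If the observable is empty after stripping.
    """
    cleaned = value.strip()
    if not cleaned:
        raise ValueError("observable must not be empty")
    return cleaned.lower() if lowercase else cleaned
```

A Markdown page in the docs can then pull this in with a mkdocstrings directive such as `::: package.module.normalize_observable`, where the dotted path is a placeholder for wherever the function actually lives.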
73+
74+
Link to [PR](https://github.com/intelowlproject/IntelOwl/pull/2430)
75+
76+
## Working and Contribution Guide for New Documentation
77+
78+
My final task involved creating a comprehensive guide for contributing to and working with the new documentation site. After discussions with Matteo and Daniel, we agreed on the structure and flow of the guides, including an example of integrating docstrings into the codebase.
79+
80+
Link to [Guides](https://intelowlproject.github.io/docs/Guide-documentation/)
81+
82+
## Ending Note and Next Steps
83+
84+
Participating in GSoC has been an incredibly enriching experience. I gained far more knowledge than I anticipated, not only in technical aspects but also in communication and time management, particularly in handling unexpected challenges.
85+
86+
Throughout the program, my mentors provided invaluable support, ensuring smooth communication and timely resolution of any issues. This enabled me to stay on track and complete my tasks effectively.
87+
88+
Looking forward, I am eager to continue contributing to open-source projects, particularly within the IntelOwl organization. I have several ideas for new features to further enhance the project’s documentation site. It’s deeply fulfilling to contribute to the community that has been instrumental in my learning journey.
Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
1+
---
2+
title: New Analyzers for IntelOwl.
3+
date: 2024-08-19
4+
cover: /images/gsoclogo.png
5+
author: Nilay Gupta
6+
---
7+
8+
## Introduction
9+
10+
As an engineer, I'm always on the lookout for interesting projects and products. One such project that caught my eye was Honeynet's IntelOwl Project. I'll keep this blog short and crisp, elucidating all my contributions since then.
11+
12+
### Pre-GSOC Commits/Discussions
13+
14+
| PR Number | Title |
15+
| --------- | ----- |
16+
| [#2209](https://github.com/intelowlproject/IntelOwl/pull/2209) | Tweet feedsfixes#1770 |
17+
| [#2178](https://github.com/intelowlproject/IntelOwl/pull/2178) | Fixes bgp ranking#1901 |
18+
| [#2126](https://github.com/intelowlproject/IntelOwl/pull/2126) | Feodo tracker#1103 |
19+
| [#2164](https://github.com/intelowlproject/IntelOwl/pull/2164) | Misp, closes #1955 |
20+
| [#2161](https://github.com/intelowlproject/IntelOwl/pull/2161) | Pinning image version of Phoneinfoga Analyzer |
21+
| [#2148](https://github.com/intelowlproject/IntelOwl/pull/2148) | Boolean toggle |
22+
| [#2115](https://github.com/intelowlproject/IntelOwl/pull/2115) | Validin#1966 |
23+
| [#2108](https://github.com/intelowlproject/IntelOwl/pull/2108) | Zippy_scan closes #1951 |
24+
| [#2107](https://github.com/intelowlproject/IntelOwl/pull/2107) | PhoneInfoga#995 |
25+
| [#2096](https://github.com/intelowlproject/IntelOwl/pull/2096) | Update censys.io, Closes #439 |
26+
| [#2080](https://github.com/intelowlproject/IntelOwl/pull/2080) | Mmdb server, closes #1779 |
27+
| [#19](https://github.com/intelowlproject/intelowlproject.github.io/pull/19) | fixed Scroll Bar Appearance |
28+
29+
As can be noticed, my contributions were pretty heavy on developing and fixing analyzers. Inevitably, the project I chose was developing **New Analyzers for IntelOwl**.
30+
31+
In my proposal, I proposed to develop around 30 new analyzers for the community of IntelOwl users.
32+
33+
# GSoC Deliverables and Tasks
34+
35+
As anticipated, my proposal was selected, and I was assigned the project. One of my mentors, and the owner of IntelOwl, Matteo Lodi, created a [GitHub Project/Kanban board](https://github.com/orgs/intelowlproject/projects/11/). All individual issues solved, pull requests, and commits can be accessed through the board.
36+
37+
I'll now proceed to elaborate on all the significant PRs mentioned.
38+
39+
- **Blint Analyzer [PR #2257](https://github.com/intelowlproject/IntelOwl/pull/2257) :**
40+
[Blint](https://github.com/owasp-dep-scan/blint) is a Binary Linter that checks the security properties and capabilities of your executables. Supported binary formats: Android (apk, aab), ELF (GNU, musl), PE (exe, dll), and Mach-O (x64, arm64).
41+
- **HudsonRock Analyzer [PR #2327](https://github.com/intelowlproject/IntelOwl/pull/2327) :**
42+
[Hudson Rock](https://cavalier.hudsonrock.com/docs) provides its clients the ability to query a database of over 27,541,128 computers which were compromised through global info-stealer campaigns performed by threat actors.
43+
- **CyCat Analyzer [PR #2328](https://github.com/intelowlproject/IntelOwl/pull/2328/) :**
44+
[CyCat](https://cycat.org/), or the CYbersecurity Resource CATalogue, aims at mapping and documenting, in a single formalism and catalogue, available cybersecurity tools, rules, playbooks, processes and controls.
45+
- **Vulners Analyzer [PR #2340](https://github.com/intelowlproject/IntelOwl/pull/2340) :**
46+
[Vulners](https://vulners.com) describes itself as the most complete and the only fully correlated security intelligence database, which goes through constant updates and links 200+ data sources in a unified machine-readable format. It contains 8 million+ entries, including CVEs, advisories, exploits, and IoCs: everything you need to stay abreast of the latest security threats.
47+
- **Ailtyposquatting Analyzer [PR #2341](https://github.com/intelowlproject/IntelOwl/pull/2341) :**
48+
[AILTypoSquatting](https://github.com/typosquatter/ail-typo-squatting) is a Python library that generates a list of potential typosquatting domains with a domain name permutation engine to feed AIL and other systems.
49+
- **DetectItEasy Analyzer [PR #2354](https://github.com/intelowlproject/IntelOwl/pull/2354) :**
50+
[DetectItEasy](https://github.com/horsicq/Detect-It-Easy) is a program for determining types of files.
51+
- **Malprob Analyzer [PR #2357](https://github.com/intelowlproject/IntelOwl/pull/2357) :**
52+
[Malprob](https://malprob.io/) is a leading malware detection and identification service, powered by cutting-edge AI technology.
53+
- **AdGuard Analyzer [PR #2363](https://github.com/intelowlproject/IntelOwl/pull/2363) :**
54+
[Adguard](https://github.com/AdguardTeam/AdguardSDNSFilter), a filter composed of several other filters (AdGuard Base filter, Social media filter, Tracking Protection filter, Mobile Ads filter, EasyList and EasyPrivacy) and simplified specifically to be better compatible with DNS-level ad blocking.
55+
- **Auto creation default test user with debug=true [PR #2369](https://github.com/intelowlproject/IntelOwl/pull/2369) :**
56+
Automatically creates an admin user the first time IntelOwl starts up, which avoids having to create a user manually on every new build during development.
57+
- **Spamhaus_WQS Analyzer [PR #2378](https://github.com/intelowlproject/IntelOwl/pull/2378) :**
58+
[Spamhaus_WQS](https://docs.spamhaus.com/datasets/docs/source/70-access-methods/web-query-service/000-intro.html) : The Spamhaus Web Query Service (WQS) is a method of accessing Spamhaus block lists using the HTTPS protocol.
59+
- **Crt_sh Analyzer [PR #2379](https://github.com/intelowlproject/IntelOwl/pull/2379) :**
60+
[Crt_Sh](https://crt.sh/) lets you get certificates info about a domain.
61+
- **Orkl_search Analyzer [PR #2380](https://github.com/intelowlproject/IntelOwl/pull/2380) :**
62+
[Orkl](https://orkl.eu/) is the Community Driven Cyber Threat Intelligence Library.
63+
- **Goresym Analyzer, fixes#1451 and fixes executable file support [PR #2401](https://github.com/intelowlproject/IntelOwl/pull/2401) :**
64+
- [GoReSym](https://github.com/mandiant/GoReSym) is a Go symbol parser that extracts program metadata (such as CPU architecture, OS, endianness, compiler version, etc), function metadata (start & end addresses, names, sources), filename and line number metadata, and embedded structures and types.
65+
66+
I fixed an important bug, which involved correcting support for the mimetypes `application/vnd.microsoft.portable-executable` and `application/x-dosexec`. I had to migrate back, run a query to find all the analyzers that previously supported `application/x-executable`, and use the resulting list to write a migration that updates those specific analyzers.
67+
- **JA4_DB Analyzer [PR #2402](https://github.com/intelowlproject/IntelOwl/pull/2402) :**
68+
[JA4_DB](https://ja4db.com/) lets you search for a fingerprint in the JA4 database.
69+
- **Spamhaus_drop Analyzer [PR #2422](https://github.com/intelowlproject/IntelOwl/pull/2422) :**
70+
[Spamhaus_DROP](https://www.spamhaus.org/blocklists/do-not-route-or-peer/) protects from activity directly originating from rogue networks, such as spam campaigns, encryption via ransomware, DNS-hijacking and exploit attempts, authentication attacks to discover working access credentials, harvesting, DDoS attacks.
71+
- **Leakix Analyzer [PR #2423](https://github.com/intelowlproject/IntelOwl/pull/2423) :**
72+
[LeakIX](https://leakix.net/) is a red-team search engine indexing mis-configurations and vulnerabilities online.
73+
- **Iocextract Analyzer [PR #2426](https://github.com/intelowlproject/IntelOwl/pull/2426) :**
74+
[IocExtract](https://github.com/InQuest/iocextract) package is a library and command line interface (CLI) for extracting URLs, IP addresses, MD5/SHA hashes, email addresses, and YARA rules from text corpora. It allows for you to extract encoded and "defanged" IOCs and optionally decode or refang them.
75+
- **Apivoid Analyzer [PR #2428](https://github.com/intelowlproject/IntelOwl/pull/2428) :**
76+
[ApiVoid](https://www.apivoid.com/) provides JSON APIs useful for cyber threat analysis, threat detection and
77+
threat prevention, reducing and automating the manual work of security analysts.
78+
- **CriminalIp Analyzer [PR #2435](https://github.com/intelowlproject/IntelOwl/pull/2435) :**
79+
[Criminal IP](https://www.criminalip.io/) is an OSINT search engine specialized in attack surface assessment and threat hunting. It offers extensive cyber threat intelligence, including device reputation, geolocation, IP reputation for C2 or scanners, domain safety, malicious link detection, and APT attack vectors via search and API.
80+
- **Criminalip_Scan Analyzer [PR #2438](https://github.com/intelowlproject/IntelOwl/pull/2438)**
81+
CriminalIp_Scan is an implementation of scan APIs provided by [CriminalIp](https://www.criminalip.io/) specifically for domains.
82+
- **Polyswarm analyzer [PR #2439](https://github.com/intelowlproject/IntelOwl/pull/2439) :**
83+
Scans a file using the [Polyswarm](https://docs.polyswarm.io/) API.
84+
- **PolyswarmObs [PR #2439](https://github.com/intelowlproject/IntelOwl/pull/2439) :**
85+
Scan an observable using [Polyswarm](https://docs.polyswarm.io/) API. Paid plan is required for IP and Domain scans. Hash scan is free.
86+
- **Knock analyzer [PR #2448](https://github.com/intelowlproject/IntelOwl/pull/2448) :**
87+
[Knock](https://github.com/guelfoweb/knock) or Knockpy is a portable and modular python3 tool designed to quickly enumerate subdomains on a target domain through passive reconnaissance and dictionary scan.
88+
- **Improved PE_info analyzer [PR #2464](https://github.com/intelowlproject/IntelOwl/pull/2464) :**
89+
Improved the PE_info analyzer: added support for ".NET" files and extraction of their info.
90+
- **Droidlysis analyzer [PR #2454](https://github.com/intelowlproject/IntelOwl/pull/2454) :**
91+
[DroidLysis](https://github.com/cryptax/droidlysis) is a pre-analysis tool for Android apps: it performs repetitive and boring tasks we'd typically do at the beginning of any reverse engineering. It disassembles the Android sample, organizes output in directories, and searches for suspicious spots in the code to look at. The output helps the reverse engineer speed up the first few steps of analysis.
92+
- **MobSF Analyzer [PR #2461](https://github.com/intelowlproject/IntelOwl/pull/2461) :**
93+
[Mobsfscan](https://github.com/MobSF/mobsfscan) is a static analysis tool that can find insecure code patterns in your Android and iOS source code. Supports Java, Kotlin, Android XML, Swift and Objective C Code.
94+
- **Apk_artifacts analyzer [PR #2469](https://github.com/intelowlproject/IntelOwl/pull/2469) :**
95+
Apk [artifacts](https://github.com/guelfoweb/artifacts) provides APK strings analysis. It provides analysis, similarity and a report of an apk file.
96+
- **Markdown Features [PR #33](https://github.com/intelowlproject/intelowlproject.github.io/pull/33) :**
97+
Improved markdown support for IntelOwl's blog site.
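
For the GoReSym mimetype fix described above, the "find the affected analyzers, then update them" step can be sketched in plain Python. This is not the actual migration: it assumes analyzers are represented as dictionaries with a `supported_filetypes` list, and it assumes the fix adds the two PE mimetypes to every analyzer that listed `application/x-executable`. The function name and data layout are hypothetical.

```python
from typing import Dict, List

OLD_MIMETYPE = "application/x-executable"
PE_MIMETYPES = [
    "application/vnd.microsoft.portable-executable",
    "application/x-dosexec",
]


def add_pe_mimetypes(analyzers: List[Dict]) -> List[str]:
    """Add the PE mimetypes to analyzers that support OLD_MIMETYPE.

    Returns the names of the analyzers that were changed, which is the
    list a data migration would then operate on.
    """
    updated = []
    for analyzer in analyzers:
        filetypes = analyzer["supported_filetypes"]
        if OLD_MIMETYPE not in filetypes:
            continue  # analyzer never handled executables, leave it alone
        missing = [m for m in PE_MIMETYPES if m not in filetypes]
        if missing:
            filetypes.extend(missing)
            updated.append(analyzer["name"])
    return updated
```

The function is idempotent: running it a second time on the same data changes nothing and returns an empty list, which is a useful property for anything that ends up inside a migration.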
98+
99+
## Ending Note and Next Steps
100+
101+
GSoC has been a hell of a ride for me. At first glance, implementing a new analyzer seems to be an easy task and, in fact, it is pretty easy. The real challenge starts when one has to develop and test several of them in parallel. The current framework for analyzer development is really smooth for a one-at-a-time approach, but things get intricate and tricky when working on a handful of them at the same time. Migration issues, dependency management, and database integrity are a few topics that only scratch the surface. Rebuilding the project from scratch every time you switch to another analyzer is surely an option, but it's time-consuming, and delivering an average of 3 analyzers per week requires quicker solutions. Plus, I'm too impatient for it :P
102+
As a beginner in the tech world, I came across a huge load of challenges as I proceeded with each analyzer in the project. Navigating through unforeseen bugs, Git conflicts, packages becoming unmaintained, etc. helped me grow exponentially as a developer.
103+
All this experience has helped me understand the importance of OSINT in cybersecurity and how my contributions are a tiny but impactful effort in making the world a safer place.
104+
105+
I'm always eager to work on new ideas and features in this project. I hope that I'm able to make time to contribute more to the project in the future and give back to the community as much as I can.
106+
Thanks to my mentors, Matteo Lodi and Daniel Rosetti, for their continuous support and for making this GSoC a worthwhile experience. Thank you, IntelOwl :)
