Commit 17cb3ba

Author: xhebraj
Message: Added resources
1 parent e95975d, commit 17cb3ba

10 files changed: +258 -6 lines changed

hw1/README.md

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Data Mining Homework 1
+
+The programs expect to find `beers.txt` under `data/`.
+
+In order to launch the code samples make sure to be in the program's
+directory (i.e. launch the program `p4.sh` from the `p4/` folder)
+
+
+### Problem 6
+
+```
+usage: p6.py [-h] [-f] [-n N_PROC]
+
+Scrape kijiji.it Informatica/Grafica/Web category
+
+optional arguments:
+  -h, --help            show this help message and exit
+  -f, --full_desc       Download full description
+  -n N_PROC, --n_proc N_PROC
+                        Number of processes to run
+```
+
+
+```
+author: Anxhelo Xhebraj <[email protected]>
+```
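
The usage text in the README above implies an `argparse` setup along the following lines. This is a minimal sketch, not the actual contents of `p6.py`: the flag names, program name, and help strings are copied from the usage text, while the `build_parser` helper, the `store_true` action, the `int` type, and the default of `1` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser matching the usage text shown in the README.

    Assumptions (not confirmed by the commit): -f is a boolean switch,
    -n takes an integer, and the default process count is 1.
    """
    parser = argparse.ArgumentParser(
        prog="p6.py",
        description="Scrape kijiji.it Informatica/Grafica/Web category",
    )
    parser.add_argument(
        "-f", "--full_desc",
        action="store_true",
        help="Download full description",
    )
    parser.add_argument(
        "-n", "--n_proc",
        type=int,
        default=1,
        help="Number of processes to run",
    )
    return parser


# Example: build_parser().parse_args(["-f", "-n", "4"])
# yields a namespace with full_desc=True and n_proc=4.
```

Note that recent Python versions print the section heading as `options:` rather than `optional arguments:`, so the help output may differ slightly from the README depending on the interpreter.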

hw1/data/.gitkeep

Whitespace-only changes.

hw1/homework1-solution.pdf

219 KB
Binary file not shown.

hw1/homework1.pdf

179 KB
Binary file not shown.

hw1/latex_src/house.png

9.52 KB

hw1/latex_src/main.tex

Lines changed: 227 additions & 0 deletions
Large diffs are not rendered by default.

hw1/p1/simulation.py

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+#!/usr/bin/env python3
+
 from collections import Counter
 from random import shuffle

hw1/p4/p4.sh

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-#! /bin/env bash
+#!/usr/bin/env bash
 
 awk -F '\t' '{arr[$1]++} END{for (a in arr) print arr[a], a}' ../data/beers.txt | sort -n -r | head -n 10 | cut -d' ' -f2-
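
For readers less familiar with `awk`, the pipeline above counts how often each value of the first tab-separated field occurs in `../data/beers.txt` and prints the ten most frequent values without their counts. A rough Python equivalent is sketched below; the function name `top_first_fields` is hypothetical and does not appear in the repository, and the order of entries with equal counts may differ from what `sort -n -r` produces.

```python
from collections import Counter


def top_first_fields(lines, n=10):
    """Return the n most frequent values of the first tab-separated field."""
    # Mirrors awk's arr[$1]++ with -F '\t': key on everything before the first tab.
    counts = Counter(line.rstrip("\n").split("\t", 1)[0] for line in lines)
    # most_common sorts by descending count, like sort -n -r | head -n 10;
    # dropping the count corresponds to cut -d' ' -f2-.
    return [name for name, _ in counts.most_common(n)]


# Example usage, run from the p4/ folder as the README suggests:
#
#     with open("../data/beers.txt") as f:
#         print("\n".join(top_first_fields(f)))
```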

hw1/p5/p5.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-#!/bin/env python3
+#!/usr/bin/env python3
 import heapq
 
 beers = {}

hw1/p6/p6.py

Lines changed: 1 addition & 4 deletions
@@ -1,4 +1,4 @@
-#! /usr/bin/env python3
+#!/usr/bin/env python3
 
 import argparse
 from lxml import html
@@ -76,11 +76,8 @@ def crawl_and_save(prid, urls, get_content, f):
             n_items += len(content)
             bad_req = False
         except (Exception, requests.exceptions.ConnectionError):
-            print("p"+str(prid) + ": Entering sleep")
             time.sleep(60)
-            print("p"+str(prid) + ": Awake")
     f.close()
-    print("p" + str(prid) + ": ", n_items, " Retrieved")
     return n_items
 
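
The hunk above removes the progress prints but keeps the behaviour of sleeping 60 seconds and retrying after a failed request. A minimal sketch of that sleep-and-retry pattern follows. It is not the code in `p6.py`: the names `fetch_with_retry`, `fetch`, `wait`, and `sleep` are hypothetical, and the `logging.debug` call is a swapped-in alternative to the deleted prints that stays silent unless debug logging is enabled.

```python
import logging
import time

log = logging.getLogger("crawler")  # hypothetical logger name


def fetch_with_retry(fetch, url, wait=60, sleep=time.sleep):
    """Call fetch(url) until it succeeds, pausing `wait` seconds after each failure.

    `fetch` is any callable that raises on a failed request; `sleep` is
    injectable so the pause can be stubbed out in tests.
    """
    while True:
        try:
            return fetch(url)
        except Exception as exc:
            # Silent by default; visible only when debug logging is configured.
            log.debug("request for %s failed (%s); sleeping %ss", url, exc, wait)
            sleep(wait)
```

Like the original loop, this sketch retries indefinitely; a cap on the number of attempts would be a natural extension if a permanently failing URL should not block the process.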
