Bill 1900145 Parsing Error Due to Crawling Error #34

I got an error while running html2json, and when I looked into it I found this:

```
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/home/ubuntu/crawlers/bills/specific/html2json.py", line 242, in parse_page
    d = extract_specifics(assembly_id, bill_id, meta)
  File "/home/ubuntu/crawlers/bills/specific/html2json.py", line 166, in extract_specifics
    table       = utils.get_elems(page, X['spec_table'])[1]
IndexError: list index out of range
<Greenlet at 0x7f27e79417d0: parse_page(19, '1900145',        bill_id  status                            , u'./json/19')> failed with IndexError
```
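The traceback shows `utils.get_elems(page, X['spec_table'])[1]` failing because the returned list has fewer than two entries. A minimal sketch of a guard that turns the bare `IndexError` into a message naming the bill is shown here. `nth_or_raise` and `UnexpectedPageError` are hypothetical names for illustration, not part of the crawler, and the sketch assumes only that `utils.get_elems` returns a list.

```python
class UnexpectedPageError(Exception):
    """Raised when a downloaded page lacks the elements the parser expects."""


def nth_or_raise(elems, index, what, bill_id):
    """Return elems[index], or raise an error that names the bill.

    `elems` stands for whatever list the element lookup returned;
    only its length is inspected here.
    """
    if index >= len(elems):
        raise UnexpectedPageError(
            "bill %s: expected at least %d %s element(s), found %d"
            % (bill_id, index + 1, what, len(elems))
        )
    return elems[index]
```

In `extract_specifics` this could wrap the existing lookup, for example `nth_or_raise(utils.get_elems(page, X['spec_table']), 1, 'spec_table', bill_id)`, so that one bad page is reported clearly instead of surfacing as an opaque greenlet failure.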

It looks like something went wrong when sources/specifics/19/1900145.html was downloaded. The saved file contains the following:

```
^M
^M
^M
<SCRIPT LANGUAGE="javascript">^M
^M
</SCRIPT>^M
^M
^M
^M
<HTML>^M
<BODY ONLOAD="javascript:onLoad()">^M
        <TEXTAREA ID="MSG" STYLE="display:none">[SQLException] Code[24757] Msg[ORA-24757: Æ®·£Àè¼Ç ½Äº°ÀÚ°¡ Áßº¹µÇ¾ú½À´Ï´Ù
ORA-02063: line°¡ ¼±ÇàµÊ (NALAW_LINK·Î ºÎÅÍ)
][µ¥ÀÌÅÍº£ÀÌ½º ¿À·ù]</TEXTAREA> ^M
</BODY>^M
</HTML>
```
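The saved file is an error page from the server, not a bill page. The legible parts are the `[SQLException]` marker and two Oracle codes: ORA-24757 (duplicate transaction identifier) and ORA-02063 (preceding line from a database link). The rest of the message looks like Korean text decoded with the wrong character set. The sketch below, written for Python 3, detects such a page from its raw bytes and tries to repair the message. The function names are hypothetical, and the EUC-KR encoding is an assumption based on the look of the garbled characters, not something confirmed in this issue.

```python
import re

# Byte markers taken from the error page shown above.
ERROR_MARKERS = (b"[SQLException]", b"ORA-")


def looks_like_error_page(raw):
    """Return True if the raw page bytes contain a database error marker."""
    return any(marker in raw for marker in ERROR_MARKERS)


def oracle_codes(raw):
    """Return the ORA-nnnnn codes found in the raw page bytes, in order."""
    return re.findall(rb"ORA-\d{5}", raw)


def repair_mojibake(text, wrong="latin-1", right="euc-kr"):
    """Re-decode text that was read as `wrong` but was really `right`.

    Assumption: the server sends EUC-KR and the page was read as Latin-1.
    Bytes that are not valid in `right` become replacement characters.
    """
    return text.encode(wrong).decode(right, errors="replace")
```

Running `looks_like_error_page` on each file before parsing would let html2json skip or flag the file instead of crashing on it.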

How should we handle a case like this? The server returned an SQL exception instead of the bill page, so I think the crawler needs a way to download such pages again.
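A re-download feature could be sketched as a small retry loop, on the assumption that the database error is transient and a later request succeeds. Everything here is hypothetical: `fetch` stands for whatever function the crawler uses to download one page, and `is_error_page` for a check that recognizes the SQL exception page. The back-off values are arbitrary.

```python
import time


def fetch_with_retry(fetch, is_error_page, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a page that is not an error page.

    fetch:         zero-argument callable returning the raw page bytes
    is_error_page: callable taking the raw bytes, True for a bad page
    attempts:      total number of downloads to try
    delay:         base wait in seconds, doubled after each failure
    """
    for attempt in range(attempts):
        raw = fetch()
        if not is_error_page(raw):
            return raw
        # Wait before the next try, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay * (2 ** attempt))
    raise RuntimeError("still an error page after %d attempts" % attempts)
```

Raising after the last attempt, instead of saving the bad page, keeps a file like 1900145.html from reaching the parsing step at all.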