Open
Description
I got an error while running html2json, so I looked into what was going on:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/home/ubuntu/crawlers/bills/specific/html2json.py", line 242, in parse_page
    d = extract_specifics(assembly_id, bill_id, meta)
  File "/home/ubuntu/crawlers/bills/specific/html2json.py", line 166, in extract_specifics
    table = utils.get_elems(page, X['spec_table'])[1]
IndexError: list index out of range
<Greenlet at 0x7f27e79417d0: parse_page(19, '1900145', bill_id status , u'./json/19')> failed with IndexError
It looks like something went wrong when the file sources/specifics/19/1900145.html was downloaded. Its contents are:
<SCRIPT LANGUAGE="javascript">
<!--
function onLoad() {
alert(document.all["MSG"].innerText);
}
-->
</SCRIPT>

<HTML>
<BODY ONLOAD="javascript:onLoad()">
<TEXTAREA ID="MSG" STYLE="display:none">[SQLException] Code[24757] Msg[ORA-24757: duplicate transaction identifier
ORA-02063: preceding line from NALAW_LINK
][Database error]</TEXTAREA>
</BODY>
</HTML>
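What should we do in a case like this? The server returned an SQL exception instead of the bill page, so I think the crawler needs a way to download the page again when this happens.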
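One possible shape for that re-download step, as a minimal sketch rather than the crawler's actual code: check the downloaded text for the `[SQLException]` marker seen in the page above and fetch again if it is present. The names `is_error_page` and `fetch_with_retry`, the `fetch` callable, and the retry and delay defaults are all hypothetical, and treating that marker as the signature of a failed download is an assumption based only on this one file.

```python
import time

# Marker taken from the error page quoted in this issue. Assuming it never
# appears in a valid bill page is a guess, not something verified here.
ERROR_MARKERS = ('[SQLException]',)


def is_error_page(html):
    """Return True if the downloaded text looks like the server's error page."""
    return any(marker in html for marker in ERROR_MARKERS)


def fetch_with_retry(fetch, url, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url) until it returns a non-error page or retries run out.

    `fetch` is a hypothetical callable that takes a URL and returns the page
    as text; the real crawler's download function would be passed in here.
    """
    for attempt in range(1, retries + 1):
        html = fetch(url)
        if not is_error_page(html):
            return html
        if attempt < retries:
            # Wait a little longer after each failed attempt.
            sleep(delay * attempt)
    raise RuntimeError(
        'still an error page after %d attempts: %s' % (retries, url))
```

Raising after the last attempt, instead of saving the error page to disk, would keep html2json from later failing with the `IndexError` shown in the traceback, because the bad file would never be written.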