|
1514 | 1514 | { |
1515 | 1515 | "data": { |
1516 | 1516 | "text/plain": [ |
1517 | | - "<pandas.io.parsers.readers.TextFileReader at 0x13295f2d0>" |
| 1517 | + "<pandas.io.parsers.readers.TextFileReader at 0x13442aa10>" |
1518 | 1518 | ] |
1519 | 1519 | }, |
1520 | 1520 | "execution_count": 16, |
|
1745 | 1745 | }, |
1746 | 1746 | { |
1747 | 1747 | "cell_type": "markdown", |
1748 | | - "id": "1d72fdfb", |
| 1748 | + "id": "0f9db2a8-7291-4db9-89bc-ef5def432dae", |
1749 | 1749 | "metadata": {}, |
1750 | 1750 | "source": [ |
1751 | 1751 | "## Working with Python's csv module\n", |
|
1756 | 1756 | { |
1757 | 1757 | "cell_type": "code", |
1758 | 1758 | "execution_count": 25, |
1759 | | - "id": "1207f91c", |
| 1759 | + "id": "d4ed9b30-594c-4e83-a5f2-460b36cb6bab", |
1760 | 1760 | "metadata": {}, |
1761 | 1761 | "outputs": [ |
1762 | 1762 | { |
|
1782 | 1782 | " print(line)" |
1783 | 1783 | ] |
1784 | 1784 | }, |
| 1785 | + { |
| 1786 | + "cell_type": "markdown", |
| 1787 | + "id": "0ed726c4-5e09-4676-bcf0-f78e9f7a10e0", |
| 1788 | + "metadata": {}, |
| 1789 | + "source": [ |
| 1790 | + "[Sniffer.has_header](https://docs.python.org/3/library/csv.html#csv.Sniffer.has_header) analyses your csv file and returns ``True`` if the first row appears to be a series of column headers.\n", |
| 1791 | + "\n", |
| 1792 | + "<div class=\"alert alert-block alert-info\">\n", |
| 1793 | + "\n", |
| 1794 | + "**Note:**\n", |
| 1795 | + "\n", |
| 1796 | + "This method is only a rough heuristic and can produce both false positives and false negatives.\n", |
| 1797 | + "</div>" |
| 1798 | + ] |
| 1799 | + }, |
| 1800 | + { |
| 1801 | + "cell_type": "markdown", |
| 1802 | + "id": "a19c05c1-e947-471b-8089-8e36e65b4268", |
| 1803 | + "metadata": {}, |
| 1804 | + "source": [ |
| 1805 | + "[Sniffer.sniff](https://docs.python.org/3/library/csv.html#csv.Sniffer.sniff) also analyses your csv file, but returns a ``Dialect`` subclass reflecting the parameters it found, which need not match one of the predefined dialects described below." |
| 1806 | + ] |
| 1807 | + }, |
| 1808 | + { |
| 1809 | + "cell_type": "code", |
| 1810 | + "execution_count": 26, |
| 1811 | + "id": "263a8cb4-4ae1-46f0-963f-9d2df2de45ed", |
| 1812 | + "metadata": {}, |
| 1813 | + "outputs": [ |
| 1814 | + { |
| 1815 | + "name": "stdout", |
| 1816 | + "output_type": "stream", |
| 1817 | + "text": [ |
| 1818 | + "['', 'Titel', 'Sprache', 'Autor*innen', 'Lizenz', 'Veröffentlichungsdatum', 'doi']\n", |
| 1819 | + "['0', 'Python basics', 'en', 'Veit Schiele', '', '2021-10-28', '']\n", |
| 1820 | + "['1', 'Jupyter Tutorial', 'en', 'Veit Schiele', '', '2019-06-27', '']\n", |
| 1821 | + "['2', 'Jupyter Tutorial', 'de', 'Veit Schiele', '', '2020-10-26', '']\n", |
| 1822 | + "['3', 'PyViz Tutorial', 'en', 'Veit Schiele', '', '2020-04-13', '']\n" |
| 1823 | + ] |
| 1824 | + } |
| 1825 | + ], |
| 1826 | + "source": [ |
| 1827 | + "with open('out.csv', newline='') as f:\n", |
| 1828 | + " dialect = csv.Sniffer().sniff(f.read(1024))\n", |
| 1829 | + " f.seek(0)\n", |
| 1830 | + " reader = csv.reader(f, dialect)\n", |
| 1831 | + "\n", |
| 1832 | + " for line in reader:\n", |
| 1833 | + " print(line)" |
| 1834 | + ] |
| 1835 | + }, |
1785 | 1836 | { |
1786 | 1837 | "cell_type": "markdown", |
1787 | 1838 | "id": "e70392b5", |
|
1791 | 1842 | "\n", |
1792 | 1843 | "csv files come in many different variants. The Python csv module already ships with three different dialects:\n", |
1793 | 1844 | "\n", |
1794 | | - "Parameter | excel | excel-tab | unix\n", |
| 1845 | + "Parameter | [excel](https://docs.python.org/3/library/csv.html#csv.excel) | [excel-tab](https://docs.python.org/3/library/csv.html#csv.excel_tab) | [unix](https://docs.python.org/3/library/csv.html#csv.unix_dialect)\n", |
1795 | 1846 | ":--- | :--- | :--- | :--- \n", |
1796 | 1847 | "`delimiter` | `','` | `'\\t'` | `','` |\n", |
1797 | 1848 | "`quotechar` | `'\"'` | `'\"'` | ` '\"'` |\n", |
|
1816 | 1867 | }, |
1817 | 1868 | { |
1818 | 1869 | "cell_type": "code", |
1819 | | - "execution_count": 26, |
| 1870 | + "execution_count": 27, |
1820 | 1871 | "id": "8d765adf", |
1821 | 1872 | "metadata": {}, |
1822 | 1873 | "outputs": [], |
|
1840 | 1891 | }, |
1841 | 1892 | { |
1842 | 1893 | "cell_type": "code", |
1843 | | - "execution_count": 27, |
| 1894 | + "execution_count": 28, |
1844 | 1895 | "id": "69fff7dd", |
1845 | 1896 | "metadata": {}, |
1846 | 1897 | "outputs": [ |
|
1873 | 1924 | }, |
1874 | 1925 | { |
1875 | 1926 | "cell_type": "code", |
1876 | | - "execution_count": 28, |
| 1927 | + "execution_count": 29, |
1877 | 1928 | "id": "e9c0a9c2", |
1878 | 1929 | "metadata": {}, |
1879 | 1930 | "outputs": [ |
|
1898 | 1949 | " 'doi': ('', '', '', '')}" |
1899 | 1950 | ] |
1900 | 1951 | }, |
1901 | | - "execution_count": 28, |
| 1952 | + "execution_count": 29, |
1902 | 1953 | "metadata": {}, |
1903 | 1954 | "output_type": "execute_result" |
1904 | 1955 | } |
|
1923 | 1974 | }, |
1924 | 1975 | { |
1925 | 1976 | "cell_type": "code", |
1926 | | - "execution_count": 29, |
| 1977 | + "execution_count": 30, |
1927 | 1978 | "id": "5a43af52", |
1928 | 1979 | "metadata": {}, |
1929 | 1980 | "outputs": [], |
|
1937 | 1988 | }, |
1938 | 1989 | { |
1939 | 1990 | "cell_type": "code", |
1940 | | - "execution_count": 30, |
| 1991 | + "execution_count": 31, |
1941 | 1992 | "id": "a65c4cef", |
1942 | 1993 | "metadata": {}, |
1943 | 1994 | "outputs": [ |
|
1949 | 2000 | " '2,Jupyter Tutorial,en,Veit Schiele\\n']" |
1950 | 2001 | ] |
1951 | 2002 | }, |
1952 | | - "execution_count": 30, |
| 2003 | + "execution_count": 31, |
1953 | 2004 | "metadata": {}, |
1954 | 2005 | "output_type": "execute_result" |
1955 | 2006 | } |
|
1975 | 2026 | "name": "python", |
1976 | 2027 | "nbconvert_exporter": "python", |
1977 | 2028 | "pygments_lexer": "ipython3", |
1978 | | - "version": "3.11.4" |
| 2029 | + "version": "3.11.10" |
1979 | 2030 | }, |
1980 | 2031 | "widgets": { |
1981 | 2032 | "application/vnd.jupyter.widget-state+json": { |
|
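The new markdown cell describes `Sniffer.has_header`, but the diff adds a code cell only for `Sniffer.sniff`. A minimal sketch of how the header check could be tried out is shown here. It assumes small in-memory samples instead of the notebook's `out.csv`; the sample rows and the variable names are hypothetical and not part of the commit.

```python
import csv

# Hypothetical in-memory samples (assumption: not the notebook's out.csv).
with_header = (
    "Titel,Sprache,Jahr\n"
    "Python basics,en,2021\n"
    "Jupyter Tutorial,de,2020\n"
    "PyViz Tutorial,en,2020\n"
)
# Same data with the first (header) line dropped.
without_header = with_header.split("\n", 1)[1]

sniffer = csv.Sniffer()

# has_header() compares the first row against the types and lengths
# of the values in the rows that follow it.
header_found = sniffer.has_header(with_header)
header_missing = sniffer.has_header(without_header)

print(header_found, header_missing)
```

Because the check is a heuristic, it is safest to treat the result as a hint and to confirm it against the actual file before skipping or keeping the first row.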