jiangyangcreate
diff --git a/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行.mdx‎
Lines changed: 0 additions & 421 deletions b/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行.mdx‎
diff --git a/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/asyncio.mdx‎
Lines changed: 66 additions & 0 deletions b/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/asyncio.mdx‎
diff --git a/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/concurrent.mdx‎
Lines changed: 4 additions & 0 deletions b/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/concurrent.mdx‎
diff --git a/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/index.md‎
Lines changed: 31 additions & 0 deletions b/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/index.md‎
diff --git a/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/multiprocessing.mdx‎
Lines changed: 123 additions & 0 deletions b/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/multiprocessing.mdx‎
diff --git a/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/queue.mdx‎
Lines changed: 4 additions & 0 deletions b/‎docs/docs/选择编程语言/Python/Python模块库/并发与并行/queue.mdx‎
@@ -0,0 +1,66 @@
+---
+sidebar_position: 3
+title: asyncio
+---
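+### The asyncio module
+
+asyncio is a library for writing concurrent code with the async/await syntax. It is often a good fit for I/O-bound work and high-level structured network code.
+
+Coroutines switch cooperatively: the code itself gives up control at each await point and waits to be resumed. This makes switching cheap, at the cost of more verbose syntax. When the event loop is managed by hand, it goes through these stages: create, set, submit work, run, and stop.
+
+The practical number of coroutines is limited mainly by memory usage.
+
+Coroutines are driven by an event loop: an endless loop that waits for events and dispatches each one to its handler.
+
+Each task is placed on the event loop's queue and runs when its turn comes.
+
+The await keyword suspends the current coroutine until the awaited object completes.
+
+The async keyword (as in async def) defines a coroutine.
+
+The asyncio module provides the tools for this style of asynchronous programming.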
+
+[asyncio](https://docs.python.org/zh-cn/3.10/library/asyncio.html?highlight=asyncio#module-asyncio): asyncio module documentation
+
+```python showLineNumbers
+import asyncio
+
+class TestA:
+    def __init__(self,loop) -> None:
+        self.loop = loop
+        asyncio.set_event_loop(loop=self.loop) # step 3.1
+
+    async def run_page(self,tid): # step 7
+        print(tid)
+        # write the crawler code here
+        return tid
+
+    async def close(self,):
+        for i in asyncio.all_tasks(): # step 9.1
+            i.cancel()
+        self.loop.stop() # step  9.2
+
+
+def test():
+    get_async_loop = asyncio.new_event_loop() # step 1
+    asyncio.set_event_loop(get_async_loop) # step 2
+
+    async def spider(task_obj):
+        async_task =  [asyncio.ensure_future(task_obj.run_page(1)),
+                    asyncio.ensure_future(task_obj.run_page(2)),] # step  6
+        await asyncio.wait(async_task) # step  8
+
+        await task_obj.close() # step 9
+
+    task_obj = TestA(get_async_loop) #step 3
+    asyncio.run_coroutine_threadsafe(spider(task_obj), loop=get_async_loop) #step  4
+    get_async_loop.run_forever() # step 5
+
+test()
+```
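The manual loop management above can usually be replaced by the higher-level entry points. The following is a minimal sketch of the same two-task crawl, assuming `asyncio.run` and `asyncio.gather` are acceptable here; `run_page` and `spider` mirror the names above, and `asyncio.sleep(0)` is only a placeholder for real network I/O.

```python
import asyncio


async def run_page(tid):
    # A real crawler would await network I/O here; sleep(0) just yields
    # control to the event loop once (placeholder, not real crawling).
    await asyncio.sleep(0)
    return tid


async def spider():
    # gather schedules both coroutines as tasks and waits for all of them,
    # returning the results in the order the coroutines were passed in.
    return await asyncio.gather(run_page(1), run_page(2))


if __name__ == "__main__":
    # asyncio.run creates the event loop, runs spider() and closes the loop.
    print(asyncio.run(spider()))
```

With this form there is no need to create, set, or stop the loop explicitly, and no tasks have to be cancelled by hand.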
+
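+Generator functions are very similar to coroutines (note: coroutine functions): they yield multiple times, they have multiple entry points, and their execution can be suspended. The only difference is that a generator function cannot choose where execution continues after it yields; control always returns to the generator's caller.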
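As a small illustration of suspension and of control returning to the caller, here is a sketch using a hypothetical `countdown` generator (the name is not from the original text).

```python
def countdown(n):
    """Yield n, n-1, ..., 1, suspending at every yield."""
    while n > 0:
        yield n  # execution pauses here; control returns to the caller
        n -= 1   # execution re-enters here on the next request for a value


gen = countdown(3)
first = next(gen)  # runs the body up to the first yield
rest = list(gen)   # resumes the generator until it is exhausted

if __name__ == "__main__":
    print(first, rest)
```

Each call that asks for the next value re-enters the function body right after the `yield`, which is the "multiple entry points" behaviour described above.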
+
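+In asyncio, Task is a subclass of Future, so a Task inherits Future's attributes and methods. The two behave almost identically; the main difference is that a Task wraps a coroutine and runs it on the event loop.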
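The relationship can be checked directly. This is a minimal sketch; `answer` and `main` are illustrative names, not part of the original text.

```python
import asyncio


async def answer():
    return 42


async def main():
    # create_task wraps the coroutine in a Task and schedules it on the loop.
    task = asyncio.create_task(answer())
    is_future = isinstance(task, asyncio.Future)
    result = await task  # a Task is awaited the same way as a Future
    return is_future, task.done(), result


if __name__ == "__main__":
    print(issubclass(asyncio.Task, asyncio.Future))
    print(asyncio.run(main()))
```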
@@ -0,0 +1,4 @@
+---
+sidebar_position: 5
+title: concurrent
+---
@@ -0,0 +1,31 @@
+---
+sidebar_position: 1
+title: Concurrency and Parallelism
+---
+
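+Process: the basic unit to which the operating system allocates resources. Multiprocessing is parallelism: several tasks are handled at the same instant.
+
+Thread: the smallest unit the operating system schedules. Multithreading is concurrency: several tasks are handled in alternation over a period of time (in standard CPython, the global interpreter lock lets only one thread execute Python bytecode at a time).
+
+Coroutine: think of it as several tasks cooperating and switching inside a single thread. Coroutines are also concurrency: several tasks are handled in alternation over a period of time.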
+
+:::tip
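+Multiprocessing and multithreading exist mainly to make the fullest possible use of one key hardware resource: the central processing unit (CPU).
+
+Multiprocessing improves execution efficiency by putting more CPU cores to work.
+
+Then why can multithreading, which uses no extra cores and only alternates between tasks, still be faster?
+
+Take a typical crawler task, whose workflow has two main steps:
+1.  The CPU tells the network card to send a request, then has to wait for the remote server to respond (for example, 1 second of network time).
+2.  The CPU tells the disk to write the received data for persistent storage (for example, 19 seconds of disk time).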
+
+
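+* **Single-threaded mode**: With one thread, 10 URLs must be handled strictly in order. After the request for the first URL is sent, the program **waits** until the response has arrived and the data has been fully written to disk (20 seconds in total) before it can start on the second URL. Processing 10 URLs therefore takes $10 \times 20$ seconds, that is, 200 seconds, and the CPU sits idle for most of that time.
+
+* **Multithreaded mode**: Multithreading improves this considerably. Once the first thread has sent its request, the CPU does not sit idle waiting but switches to the second thread, which sends a new request. The CPU can thus **work on other tasks while I/O operations (network responses and disk writes) are in progress**. The CPU stays as busy as possible, and several network requests and disk writes are **in flight at the same time**, which greatly shortens the total completion time.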
+
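+This switching is called a **context switch**. It has a cost and is performed automatically by the operating system, which does not necessarily switch at the most sensible moment.
+
+To improve efficiency further, the same task can be done with coroutines, where the programmer writes the `await` keyword to mark the points at which the code voluntarily **switches context**.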
+:::
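The crawler scenario in the tip can be simulated without a network. The sketch below assumes `time.sleep` as a stand-in for the network and disk waits, with a shortened 0.05 s delay instead of 20 s; `fetch_and_save` and the example URLs are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_and_save(url, io_seconds=0.05):
    """Stand-in for one crawl job; time.sleep simulates waiting on I/O."""
    time.sleep(io_seconds)
    return url


def run_sequential(urls):
    # One job after another: total time is roughly len(urls) * io_seconds.
    start = time.perf_counter()
    results = [fetch_and_save(u) for u in urls]
    return results, time.perf_counter() - start


def run_threaded(urls):
    # One thread per URL: the waits overlap instead of adding up.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        results = list(pool.map(fetch_and_save, urls))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(10)]
    _, t_seq = run_sequential(urls)
    _, t_thr = run_threaded(urls)
    print(f"sequential: {t_seq:.2f}s, threaded: {t_thr:.2f}s")
```

The threaded run finishes in about the time of a single wait, because a sleeping thread does not occupy the CPU. The exact timings depend on the machine.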
@@ -0,0 +1,123 @@
+---
+sidebar_position: 1
+title: multiprocessing
+---
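+### The multiprocessing module
+
+A process is the basic unit to which the system independently schedules and allocates resources (CPU, memory). The operating system allocates memory space per process, manages the execution of all processes, and distributes resources among them.
+
+A process is one of the running programs listed in Activity Monitor on macOS or in Task Manager on Windows.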
+
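+#### Multiple processes
+
+Processes are independent of one another. In Python, communication between processes is usually done through a Queue.
+
+Separate processes sidestep the global interpreter lock, so the multiprocessing module lets the programmer make full use of the multiple processors on a machine. It runs on both Unix and Windows.
+
+For CPU-bound work, a process count equal to the number of CPU cores is usually the most efficient. With fewer processes than cores, some cores stay idle; with more processes than cores, the processes have to be switched in and out, which adds to the running time.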
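One way to follow this rule of thumb is to size the pool from `os.cpu_count()`. The sketch below is an illustration under that assumption; `sum_range`, `split_range`, and `parallel_sum` are hypothetical helper names, and ranges are passed to the workers instead of list slices to keep the transferred data small.

```python
import os
from multiprocessing import Pool


def sum_range(bounds):
    """Sum the integers in the half-open interval [start, stop)."""
    start, stop = bounds
    return sum(range(start, stop))


def split_range(start, stop, parts):
    """Split [start, stop) into `parts` contiguous chunks of near-equal size."""
    step, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + step + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def parallel_sum(start, stop, workers=None):
    # os.cpu_count() may return None, so fall back to a single worker.
    workers = workers or os.cpu_count() or 1
    with Pool(workers) as pool:
        return sum(pool.map(sum_range, split_range(start, stop, workers)))


if __name__ == "__main__":
    print(parallel_sum(1, 1_000_001))
```

Whether one worker per core is actually optimal depends on the workload and on what else the machine is running, so the worker count is left as a parameter.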
+
+[multiprocessing](https://docs.python.org/zh-cn/3.10/library/multiprocessing.html?highlight=multiprocessing#module-multiprocessing): multiprocessing module documentation
+
+```python showLineNumbers
+from multiprocessing import Pool
+
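+def f(value):
+    x = value[0]
+    y = value[1]
+    return x*y
+
+if __name__ == '__main__':
+    p = Pool(16) # create a pool of 16 worker processes (this machine has 16 CPU cores)
+    print(p.map(f, [(1,1), (2,2), (3,3)])) # distribute the inputs across the workers
+    p.close() # stop accepting new tasks
+    p.join() # wait for the worker processes to exit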
+
+# [1, 4, 9]
+```
+
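+Next, take a CPU-bound task: summing the integers from 1 to 100000000 with a loop. For now the time spent slicing the list is ignored; only the time for the arithmetic and for merging the partial results is measured.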
+
+```python showLineNumbers
+from time import time
+
+
+def main():
+    total = 0
+    number_list = [x for x in range(1, 100000001)]
+    start = time()
+    for number in number_list:
+        total += number
+    print(total)
+    end = time()
+    print('Execution time: %.3fs' % (end - start))
+
+```
+
+```python showLineNumbers
+main()
+# 5000000050000000
+# Execution time: 6.798s
+```
+
+Now apply "divide and conquer" with multiple processes.
+
+The task is split across 8 processes:
+
+```python showLineNumbers
+from multiprocessing import Process, Queue
+from time import time
+
+core_num = 8
+
+
+def task_handler(curr_list, result_queue):
+    total = 0
+    for number in curr_list:
+        total += number
+    result_queue.put(total)
+
+
+def main():
+    processes = []
+    number_list = [x for x in range(1, 100000001)]
+    result_queue = Queue()
+    index = 0
+    # Start core_num (8) processes, each summing one slice of the data
+    index_batch = 100000000 // core_num
+    for _ in range(core_num):
+        p = Process(target=task_handler,
+                    args=(number_list[index:index + index_batch], result_queue))
+        index += index_batch
+        processes.append(p)
+        p.start()
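+    # Start timing here; the slicing and process start-up above are not measured
+    start = time()
+    # Merge the partial sums. get() blocks until a worker has reported,
+    # which is more reliable than polling result_queue.empty()
+    total = 0
+    for _ in range(core_num):
+        total += result_queue.get()
+    # Wait for every worker process to exit
+    for p in processes:
+        p.join()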
+    print(total)
+    end = time()
+    print('Execution time: ', (end - start), 's', sep='')
+
+
+if __name__ == '__main__':
+    main()
+
+```
+
+Save the code above as multi_process.py.
+
+```python showLineNumbers
+!python multi_process.py
+```
+
+```python showLineNumbers
+# 5000000050000000
+# Execution time: 0.7936668395996094s
+```
+
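+In this measurement the multi-process version is clearly faster: about 0.79 s versus 6.8 s. Note that the timer in the multi-process script starts only after the workers have been launched.
+
+With multiple processes the program gets more CPU time and makes better use of the CPU's multiple cores, so the execution time drops noticeably, and the larger the computation, the more pronounced the effect.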
@@ -0,0 +1,4 @@
+---
+sidebar_position: 4
+title: queue
+---