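Starting with containers and iterables
Every container is iterable. Calling iter() on an iterable returns an iterator, and an iterator provides a __next__ method, so repeatedly calling next() on it walks through the elements one by one.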
    def is_iterable(param):
        try:
            iter(param)
            return True
        except TypeError:
            return False

    params = [
        1234,
        '1234',
        [1, 2, 3, 4],
        set([1, 2, 3, 4]),
        {1: 1, 2: 2, 3: 3, 4: 4},
        (1, 2, 3, 4)
    ]

    for param in params:
        print('{} is iterable? {}'.format(param, is_iterable(param)))

    ########## Output ##########
    # 1234 is iterable? False
    # 1234 is iterable? True
    # [1, 2, 3, 4] is iterable? True
    # {1, 2, 3, 4} is iterable? True
    # {1: 1, 2: 2, 3: 3, 4: 4} is iterable? True
    # (1, 2, 3, 4) is iterable? True
Of the values tested, only the integer is not iterable; the string, list, set, dict and tuple all are.
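To make the roles of iter() and next() concrete, here is a minimal sketch that walks an iterable by hand, which is roughly what a for loop does internally. The helper name manual_traverse is made up for this illustration and does not appear in the original article.

```python
def manual_traverse(iterable):
    """Collect the items of an iterable by driving its iterator by hand."""
    it = iter(iterable)              # ask the iterable for an iterator
    items = []
    while True:
        try:
            items.append(next(it))   # fetch the next element
        except StopIteration:        # raised once the iterator is exhausted
            break
    return items

result = manual_traverse([1, 2, 3, 4])
print(result)
```

The StopIteration exception is how an iterator signals that nothing is left; a for loop catches it silently.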
What is a generator
A generator is a lazy version of an iterator. For example:
    import os
    import psutil

    # Show how much memory the current Python process is using
    def show_memory_info(hint):
        pid = os.getpid()
        p = psutil.Process(pid)

        info = p.memory_full_info()
        memory = info.uss / 1024. / 1024
        print('{} memory used: {} MB'.format(hint, memory))

    def test_iterator():
        show_memory_info('initing iterator')
        list_1 = [i for i in range(100000000)]
        show_memory_info('after iterator initiated')
        print(sum(list_1))
        show_memory_info('after sum called')

    def test_generator():
        show_memory_info('initing generator')
        list_2 = (i for i in range(100000000))
        show_memory_info('after generator initiated')
        print(sum(list_2))
        show_memory_info('after sum called')

    # %time is an IPython/Jupyter magic; in a plain script, call the functions directly
    %time test_iterator()
    %time test_generator()

    ########## Output ##########
    # initing iterator memory used: 48.9765625 MB
    # after iterator initiated memory used: 3920.30078125 MB
    # 4999999950000000
    # after sum called memory used: 3920.3046875 MB
    # Wall time: 17 s
    # initing generator memory used: 50.359375 MB
    # after generator initiated memory used: 50.359375 MB
    # 4999999950000000
    # after sum called memory used: 50.109375 MB
    # Wall time: 12.5 s
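[i for i in range(100000000)] builds a list: every element is created up front and kept in memory, which takes a huge amount of space (about 3.9 GB in the run above). (i for i in range(100000000)) only creates a generator. As the output shows, the generator does not consume memory the way the list does: compared with test_iterator(), test_generator() skips the step of materializing one hundred million elements at once. The next element is produced only when next() is called.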
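The same contrast can be seen at a much smaller scale without psutil. The sketch below is an illustration only, assuming CPython; sys.getsizeof reports the size of the list or generator object itself, not of the integers it refers to, and the variable names are invented for this example.

```python
import sys

numbers_list = [i for i in range(1000)]   # all 1000 elements exist now
numbers_gen = (i for i in range(1000))    # nothing has been computed yet

list_size = sys.getsizeof(numbers_list)   # grows with the number of elements
gen_size = sys.getsizeof(numbers_gen)     # small and independent of the range
print(list_size > gen_size)

first = next(numbers_gen)     # computes only the first element
second = next(numbers_gen)    # resumes and computes the second
print(first, second)

remaining_sum = sum(numbers_gen)   # consumes the rest: 2 + 3 + ... + 999
print(remaining_sum)
```

Once sum() has consumed the generator it is exhausted; unlike the list, it cannot be iterated a second time.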
What generators can do
There is an identity in mathematics, (1 + 2 + 3 + ... + n)^2 = 1^3 + 2^3 + 3^3 + ... + n^3, which the following code checks for n = 8:
    def generator(k):
        i = 1
        while True:
            yield i ** k
            i += 1

    gen_1 = generator(1)
    gen_3 = generator(3)
    print(gen_1)
    print(gen_3)

    def get_sum(n):
        sum_1, sum_3 = 0, 0
        for i in range(n):
            next_1 = next(gen_1)
            next_3 = next(gen_3)
            print('next_1 = {}, next_3 = {}'.format(next_1, next_3))
            sum_1 += next_1
            sum_3 += next_3
        print(sum_1 * sum_1, sum_3)

    get_sum(8)

    ########## Output ##########
    # <generator object generator at 0x000001E70651C4F8>
    # <generator object generator at 0x000001E70651C390>
    # next_1 = 1, next_3 = 1
    # next_1 = 2, next_3 = 8
    # next_1 = 3, next_3 = 27
    # next_1 = 4, next_3 = 64
    # next_1 = 5, next_3 = 125
    # next_1 = 6, next_3 = 216
    # next_1 = 7, next_3 = 343
    # next_1 = 8, next_3 = 512
    # 1296 1296
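generator() is a function that returns a generator. When execution reaches yield i ** k, the function pauses and hands i ** k back as the return value of next(). Each call to next(gen) resumes the paused function and runs it up to the next yield; the value of i is remembered and keeps incrementing, so the last values are next_1 = 8 and next_3 = 512.
Looking closely at this example: a container such as a list holds a finite set of elements, whereas a generator can represent an infinite sequence. Each call to next() makes the generator compute a new element and return it, which is very convenient.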
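Because such a generator never ends, it must be consumed in bounded pieces. One possible way, sketched here as an illustration rather than as the original author's method, is itertools.islice from the standard library. The names powers and identity_holds are invented for this example; powers mirrors generator(k) above.

```python
from itertools import islice

def powers(k):
    """Yield 1**k, 2**k, 3**k, ... without end."""
    i = 1
    while True:
        yield i ** k
        i += 1

def identity_holds(n):
    # islice stops after n items, so the infinite generator is only partly consumed
    sum_1 = sum(islice(powers(1), n))
    sum_3 = sum(islice(powers(3), n))
    return sum_1 ** 2 == sum_3

print(all(identity_holds(n) for n in range(1, 50)))
```

Creating a fresh generator inside identity_holds avoids the shared global state of gen_1 and gen_3, so the check can be repeated for any n.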
Here is another problem: given a list and a target number, find every position at which the number occurs in the list:
    # Conventional version
    def index_normal(L, target):
        result = []
        for i, num in enumerate(L):
            if num == target:
                result.append(i)
        return result

    print(index_normal([1, 6, 2, 4, 5, 2, 8, 6, 3, 2], 2))

    ########## Output ##########
    # [2, 5, 9]

    # Generator version
    def index_generator(L, target):
        for i, num in enumerate(L):
            if num == target:
                yield i

    print(list(index_generator([1, 6, 2, 4, 5, 2, 8, 6, 3, 2], 2)))

    ########## Output ##########
    # [2, 5, 9]
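One practical difference between the two versions: the generator does no work until it is asked for a value, so a caller that only needs the first match can stop early. The sketch below repeats index_generator so that it runs on its own; the variable names are invented for the illustration.

```python
def index_generator(L, target):
    for i, num in enumerate(L):
        if num == target:
            yield i

data = [1, 6, 2, 4, 5, 2, 8, 6, 3, 2]

# next() with a default pulls one value; scanning stops at the first match
first_match = next(index_generator(data, 2), None)

# The default is returned when the generator yields nothing at all
no_match = next(index_generator(data, 7), None)

print(first_match, no_match)
```

With index_normal the whole list is always scanned and a full result list is built, even if only result[0] is used.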
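One more example:
Finding a subsequence: given two strings a and b, determine whether a is a subsequence of b. Here a subsequence means that the elements of one sequence all appear in the other, in the same order.
Algorithm: point one pointer at the head of each string and move them forward looking for equal values. If the pointer into b reaches the end before every element of a has been matched, a is not a subsequence of b.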
    def is_subsequence(a, b):
        b = iter(b)
        return all(i in b for i in a)

    print(is_subsequence([1, 3, 5], [1, 2, 3, 4, 5]))
    print(is_subsequence([1, 4, 3], [1, 2, 3, 4, 5]))

    ########## Output ##########
    # True
    # False
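For comparison, the two-pointer algorithm described in the text can also be written out explicitly. This is a minimal sketch assuming a and b support indexing and len() (lists, strings, tuples); the function name is_subsequence_two_pointers is invented here.

```python
def is_subsequence_two_pointers(a, b):
    i, j = 0, 0                      # i walks a, j walks b
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1                   # matched a[i]; look for the next element of a
        j += 1                       # b's pointer always moves forward
    return i == len(a)               # True only if every element of a was matched

print(is_subsequence_two_pointers([1, 3, 5], [1, 2, 3, 4, 5]))
print(is_subsequence_two_pointers([1, 4, 3], [1, 2, 3, 4, 5]))
```

The generator version does the same work: the iterator over b plays the role of the pointer j, and each `i in b` test advances it.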
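The code below is an expanded version of the code above, with print calls added to expose each intermediate step. Note that the debugging loop over ((i in b) for i in a) already consumes the iterator b, so the final all(...) sees an exhausted iterator and returns False in both calls.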
    def is_subsequence(a, b):
        b = iter(b)
        print(b)

        gen = (i for i in a)
        print(gen)

        for i in gen:
            print(i)

        gen = ((i in b) for i in a)
        print(gen)

        for i in gen:
            print(i)

        return all(((i in b) for i in a))

    print(is_subsequence([1, 3, 5], [1, 2, 3, 4, 5]))
    print(is_subsequence([1, 4, 3], [1, 2, 3, 4, 5]))

    ########## Output ##########
    # <list_iterator object at 0x000001E7063D0E80>
    # <generator object is_subsequence.<locals>.<genexpr> at 0x000001E70651C570>
    # 1
    # 3
    # 5
    # <generator object is_subsequence.<locals>.<genexpr> at 0x000001E70651C5E8>
    # True
    # True
    # True
    # False
    # <list_iterator object at 0x000001E7063D0D30>
    # <generator object is_subsequence.<locals>.<genexpr> at 0x000001E70651C5E8>
    # 1
    # 4
    # 3
    # <generator object is_subsequence.<locals>.<genexpr> at 0x000001E70651C570>
    # True
    # True
    # False
    # False
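First, iter(b) turns b into an iterator, so that next() can be called on it internally. (i for i in a) produces a generator, and so does ((i in b) for i in a). The test (i in b) is then roughly equivalent to the following pseudocode: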
    while True:
        val = next(b)      # advance the shared iterator
        if val == i:
            yield True     # found; if b runs out first, the result is False
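This cleverly exploits a property of iterators: each call to next() advances the iterator and the current position is saved, so every `in` test resumes where the previous one stopped. For example: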
    b = (i for i in range(5))

    print(2 in b)
    print(4 in b)
    print(3 in b)

    ########## Output ##########
    # True
    # True
    # False
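The behaviour of `in` on an iterator can be mimicked with an ordinary function. The sketch below is a simplified stand-in, not Python's actual implementation (the real operator also treats identical objects as equal); the name iterator_contains is invented for the illustration.

```python
def iterator_contains(it, target):
    """Consume `it` until `target` is found or the iterator runs out."""
    for val in it:            # each step calls next(it) behind the scenes
        if val == target:
            return True       # stop here; later items stay unconsumed
    return False              # iterator exhausted without a match

b = (i for i in range(5))
results = [
    iterator_contains(b, 2),  # consumes 0, 1, 2
    iterator_contains(b, 4),  # consumes 3, 4
    iterator_contains(b, 3),  # nothing left, 3 was already passed
]
print(results)
```

The third call fails not because 3 is absent from range(5) but because the generator had already moved past it.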
Original article: https://www.cnblogs.com/xiaoguanqiu/p/11099603.html