Python seeds for random number generators with multiprocessing

From https://stackoverflow.com/questions/29854398/seeding-random-number-generators-in-parallel-programs

If no seed is provided explicitly, numpy.random seeds itself from an OS-dependent source of randomness. Usually this is /dev/urandom on Unix-based systems (or some Windows equivalent); if that is unavailable for some reason, it seeds itself from the wall clock. This self-seeding happens once, when the generator is created in the parent process. A forked subprocess receives a copy of the parent's memory, including the state of the global generator, so unless it reseeds, every subprocess starts from the same state and produces identical random variates.
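
The inheritance effect can be mimicked inside a single process, since a fork hands each child a copy of the parent's memory. The sketch below is only an analogy: it uses the standard-library random.Random as a stand-in for NumPy's global generator and copy.deepcopy as a stand-in for the fork, and the names parent, child_a and child_b are illustrative.

```python
import copy
import random

# Stand-in for the generator that was already seeded in the parent process.
parent = random.Random(2024)

# A fork copies the parent's memory; deepcopy mimics that for one object.
child_a = copy.deepcopy(parent)
child_b = copy.deepcopy(parent)

draws_a = [child_a.uniform(0, 10) for _ in range(4)]
draws_b = [child_b.uniform(0, 10) for _ in range(4)]
print(draws_a == draws_b)  # True: identical state gives identical draws

# Reseeding one copy from OS entropy makes the two streams diverge.
child_b.seed()
print(child_a.uniform(0, 10), child_b.uniform(0, 10))
```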

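Parts of the following text are reproduced, with some modifications, from [Python, NumPy, Pytorch中的多进程中每个进程的随机化种子误区](https://blog.csdn.net/xiaojiajia007/article/details/90207113) (in English: random-seed pitfalls for each process when multiprocessing with Python, NumPy, and PyTorch).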

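Python's built-in random generates a different seed in each subprocess, whereas numpy.random in each subprocess forks the same seed from the main process. In PyTorch, the __getitem__() calls made through the DataLoader class get a different torch.seed() in each subprocess, and that seed depends on the worker id of the worker process (see the description of the worker_init_fn parameter). The three generators do not affect one another, however, and must be handled independently. So when you write your own data-preparation code and use NumPy's random components, be sure to resample the random seed explicitly in each subprocess, or use Python's random to generate the seeds.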

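The experiments were run on Linux-4.9.125-linuxkit-x86_64-with-Ubuntu-18.04-bionic (in fact, inside a Docker virtual machine) with Python 3.6.8. The system had 4 physical cores with hyper-threading, giving 8 logical cores. On Linux, multiprocessing starts child processes with fork by default, which is why the inherited RNG state shows up in the results. Each script was run three times; every "Out:" block shows the output of one run.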

Using numpy.random module, without seeding. Identical random sequences across subprocesses, experiment is not reproducible:

import numpy as np
import random
from multiprocessing import Pool

def worker(seed=None):
	return np.random.uniform(0, 10, 4)

pool = Pool(processes=4)
print(np.array(pool.map(worker, range(15))))

Out:
[[6.49262187 8.1209342  9.30877977 2.8359707 ]
 [6.49262187 8.1209342  9.30877977 2.8359707 ]
 [6.49262187 8.1209342  9.30877977 2.8359707 ]
 [6.49262187 8.1209342  9.30877977 2.8359707 ]
 [4.0503515  5.22821427 6.1138743  7.91021459]
 [4.0503515  5.22821427 6.1138743  7.91021459]
 [3.86868331 5.71104986 8.63595764 2.10258815]
 [3.86868331 5.71104986 8.63595764 2.10258815]
 [4.09196614 5.38849459 1.88374082 6.27455603]
 [4.0503515  5.22821427 6.1138743  7.91021459]
 [7.11586367 2.66182869 0.25424771 4.46438042]
 [4.0503515  5.22821427 6.1138743  7.91021459]
 [4.09196614 5.38849459 1.88374082 6.27455603]
 [3.86868331 5.71104986 8.63595764 2.10258815]
 [7.11586367 2.66182869 0.25424771 4.46438042]]
 
 Out:
 [[3.0011753  8.94085733 1.26403167 3.33780215]
 [3.0011753  8.94085733 1.26403167 3.33780215]
 [3.0011753  8.94085733 1.26403167 3.33780215]
 [3.0011753  8.94085733 1.26403167 3.33780215]
 [5.21202135 2.77172446 8.99082513 7.37469595]
 [5.21202135 2.77172446 8.99082513 7.37469595]
 [0.74079289 3.58791385 9.92673251 8.81624304]
 [5.21202135 2.77172446 8.99082513 7.37469595]
 [5.77312088 6.18520518 1.70873513 3.25512823]
 [5.21202135 2.77172446 8.99082513 7.37469595]
 [0.74079289 3.58791385 9.92673251 8.81624304]
 [0.74079289 3.58791385 9.92673251 8.81624304]
 [5.77312088 6.18520518 1.70873513 3.25512823]
 [0.74079289 3.58791385 9.92673251 8.81624304]
 [4.69327905 2.21723652 4.15112529 2.68184457]]
 
 Out:
 [[8.42608352 4.26365989 4.23273135 0.98033274]
 [8.42608352 4.26365989 4.23273135 0.98033274]
 [8.42608352 4.26365989 4.23273135 0.98033274]
 [8.42608352 4.26365989 4.23273135 0.98033274]
 [9.60108351 3.66480316 2.29340697 6.08440055]
 [9.60108351 3.66480316 2.29340697 6.08440055]
 [3.24274242 9.52787549 0.46328866 3.53148162]
 [3.24274242 9.52787549 0.46328866 3.53148162]
 [1.74868215 6.11640213 6.69673611 3.43192459]
 [9.60108351 3.66480316 2.29340697 6.08440055]
 [9.60108351 3.66480316 2.29340697 6.08440055]
 [3.24274242 9.52787549 0.46328866 3.53148162]
 [3.24274242 9.52787549 0.46328866 3.53148162]
 [1.74868215 6.11640213 6.69673611 3.43192459]
 [1.74868215 6.11640213 6.69673611 3.43192459]]

Using numpy.random module, seeding with no arguments. Different random sequences across subprocesses, experiment is not reproducible:

import numpy as np
import random
from multiprocessing import Pool

def worker(seed=None):
	np.random.seed()
	return np.random.uniform(0, 10, 4)

pool = Pool(processes=4)
print(np.array(pool.map(worker, range(15))))

Out:
[[2.72055845 0.8802589  3.79447566 4.35878036]
 [7.07445413 0.94125983 7.35434298 0.50540998]
 [5.95277586 8.03353379 6.59704956 5.53617238]
 [9.84429197 1.59257447 1.77645623 3.78680617]
 [2.75307488 8.29017692 5.2099913  4.43252387]
 [5.38033998 9.47567343 4.90971625 9.84378286]
 [0.72352701 2.15117972 7.62379999 6.78319677]
 [4.51341569 6.5602688  4.44566849 8.48052612]
 [5.04389035 6.71856689 6.41253087 7.52004488]
 [4.20345455 5.20049146 8.0568088  7.23299742]
 [7.19610857 7.88455016 2.98487976 4.33196014]
 [6.22188102 1.66534889 2.22853642 1.2613593 ]
 [0.74199402 5.64979461 1.704669   4.86116781]
 [6.49103534 9.25281956 9.56482262 6.57539205]
 [7.51085919 7.7772543  4.46449279 0.03745542]]
 
 Out:
 [[9.39494869 4.51063974 1.61137229 2.60741085]
 [3.36653946 2.26678542 8.05428869 7.19358866]
 [2.12750234 1.69631663 1.72097737 0.80840712]
 [8.88247029 1.71885449 1.38022186 3.9497802 ]
 [4.09138763 3.12106515 6.76802096 8.58614772]
 [7.68232307 8.04693359 4.3367779  8.03027598]
 [1.05906508 4.52861658 9.36029798 1.81410039]
 [5.67021807 8.6833277  3.90648695 8.05836433]
 [5.24232829 5.46656855 3.67320429 7.95415452]
 [6.44284184 6.60178372 0.34659434 4.84729987]
 [2.53164432 9.1651901  3.23400545 9.39691859]
 [1.73036203 3.73673368 4.4327516  5.03388905]
 [7.79328932 2.68597964 8.34646328 8.43474408]
 [8.68258261 2.17114809 4.48464149 1.91976047]
 [5.15085054 8.62400053 2.16302764 8.45979093]]
 
 Out:
 [[1.65105989 5.34805454 4.20808944 5.86171254]
 [0.29864045 4.27875838 5.47215759 4.36884446]
 [7.35232009 8.40542424 6.12664336 5.82047388]
 [7.56993499 3.46371231 3.8359816  3.17833574]
 [0.3528505  4.55242452 0.76885988 4.15433463]
 [2.96784778 5.42788002 0.10388263 5.14563438]
 [5.17591987 2.51516106 5.31085603 9.16870908]
 [4.71439683 3.86082685 0.47858268 1.77623806]
 [8.05398488 1.36262726 0.77466243 9.01735709]
 [6.7966317  0.836589   3.11442613 7.24553407]
 [5.28787898 8.78236011 9.12632954 0.44383284]
 [7.81086849 1.79341761 5.2370905  3.24437723]
 [4.15437249 2.86691807 2.49408633 0.62242588]
 [1.13911367 8.81219785 6.42852335 0.24591118]
 [3.01810029 0.02716625 8.65552876 7.40824001]]

Using numpy.random module, seeding with None. Different random sequences across subprocesses, experiment is not reproducible:

import numpy as np
import random
from multiprocessing import Pool

def worker(seed=None):
	np.random.seed(None)
	return np.random.uniform(0, 10, 4)

pool = Pool(processes=4)
print(np.array(pool.map(worker, range(15))))

Out:
[[5.66061069 1.15537753 0.72941469 7.59278288]
 [7.01785468 8.79189516 5.4073534  8.2773378 ]
 [8.64154707 6.46188201 2.32864718 6.07402244]
 [3.99662848 2.78765117 7.78397263 9.37711745]
 [0.59342934 9.84564758 6.42627522 8.51653643]
 [4.81447643 6.68004144 3.771518   1.69121905]
 [5.28963199 4.32796107 4.62243399 8.53461067]
 [5.7261191  5.75896923 1.2006108  3.05607401]
 [3.98921882 2.52606706 7.53328592 6.40883981]
 [0.39736223 8.00944951 1.10709433 9.25880692]
 [7.96486569 9.95799148 3.46257317 8.41806039]
 [5.06161476 8.74233197 5.96415057 1.84453155]
 [5.22873857 7.46636046 4.01752842 7.89652396]
 [1.48290741 5.08099786 3.24118862 7.34695478]
 [6.1168209  2.50699166 8.99069415 5.02639108]]
 
 Out:
 [[9.40498577 3.98684269 1.41401997 0.44949128]
 [2.7772966  9.00474825 2.82305118 1.90224813]
 [4.41714032 2.21045351 3.65521197 6.94542113]
 [5.47109984 3.70506815 5.61401534 0.3171225 ]
 [8.86214157 4.80788473 0.85390536 8.13337085]
 [0.29973757 3.24548397 8.23583838 7.92725853]
 [6.70596311 7.74843635 6.73633974 9.58695382]
 [1.15201196 4.53826705 5.95239931 9.87546877]
 [2.16988653 3.83007323 6.94375843 6.63441491]
 [9.43285395 0.8964209  9.05141932 9.52343054]
 [4.56217306 9.53677687 2.18906585 9.22649128]
 [0.78882755 0.30431723 9.15167895 1.53454976]
 [3.83877105 4.35934966 8.15622041 4.26282909]
 [7.75727846 7.02708933 7.81748397 0.93551887]
 [8.37046773 4.16983671 4.66616811 7.31948163]]
 
 Out:
 [[5.37984038 7.66643394 1.44751028 6.08207063]
 [0.62307939 0.0381226  9.64150795 6.2145749 ]
 [9.66367544 8.87438801 7.47616606 9.1984564 ]
 [5.85743256 7.95966147 5.21179431 1.10947049]
 [4.26549026 7.34090569 6.23851296 5.11757473]
 [8.88857647 4.63363418 6.63604227 0.29794179]
 [4.68894521 0.36940943 0.58469257 5.19456297]
 [7.98095094 2.86015854 5.80023412 6.69828148]
 [6.46117453 8.18519272 9.91373532 2.91021242]
 [6.3647256  0.97542724 4.96531842 4.08462095]
 [0.58053755 7.41943139 2.12317777 2.01503869]
 [1.14085398 4.96704605 6.6241862  2.77557808]
 [1.66571531 7.51319418 0.18487467 7.46651576]
 [1.79253752 8.69946821 7.36189555 7.41694284]
 [5.88033799 3.53234725 6.13205727 3.82982333]]

Using the numpy.random.RandomState class, constructed with no arguments. Different random sequences across subprocesses, experiment is not reproducible:

import numpy as np
import random
from multiprocessing import Pool

def worker(seed=None):
	local_rng = np.random.RandomState()
	return local_rng.uniform(0, 10, 4)

pool = Pool(processes=4)
print(np.array(pool.map(worker, range(15))))

Out:
[[7.47287954 1.64513499 5.59878623 0.72030859]
 [2.90500228 1.5166515  6.19064658 1.81021106]
 [3.49132686 5.00867363 1.88135732 1.43919645]
 [2.81794232 0.71414477 6.54793177 4.39803013]
 [9.54589269 2.5366685  8.28472206 4.24108638]
 [1.62176156 4.92360789 3.08844423 4.89328079]
 [3.41789022 5.57786777 1.50233856 4.27730662]
 [8.15775924 8.36672714 0.22283973 6.14913015]
 [4.85522657 1.11077142 2.35228615 5.40514862]
 [7.88593532 9.47651053 2.75304225 5.49017707]
 [7.66411174 0.47420446 2.68583698 1.90588588]
 [5.9533508  0.48065235 4.28996282 8.73984157]
 [3.02155658 5.89818692 5.92691295 4.5919441 ]
 [7.61758339 2.86797121 3.90167418 2.14272868]
 [6.26586922 9.5772994  5.0617227  9.99122278]]
 
 Out:
 [[6.99686852 0.21068811 5.40786144 5.2242592 ]
 [8.15711185 2.17645545 3.37533804 0.74046103]
 [6.71161056 7.08232828 3.40270084 8.51235743]
 [7.1352259  6.96005528 9.02192899 9.5397134 ]
 [4.28600392 1.58834436 1.66532127 2.28879876]
 [2.79706639 8.37018059 5.35126524 0.04243214]
 [0.73511349 9.48183283 3.11555672 7.64746782]
 [7.30640946 1.61987    9.50893433 6.93870779]
 [7.61393021 6.82058752 9.02091704 0.05923267]
 [7.37401727 5.54675581 5.77661225 2.17095416]
 [2.59241671 3.406191   3.24546694 7.6233631 ]
 [8.56201193 1.5353022  7.32492813 7.87631658]
 [8.05848537 6.10073285 0.78214471 9.02082671]
 [2.68440308 8.97491905 5.9364795  5.47908633]
 [0.1401708  7.82240792 1.25191982 4.93649086]]
 
 Out:
 [[4.4837213  0.41508092 8.4559796  0.76768878]
 [9.19092941 2.94394276 6.56804776 5.00689786]
 [5.16994826 5.99255444 0.48031827 5.44022494]
 [4.46061986 2.15149061 5.77681651 4.82017194]
 [9.93193371 5.91830564 1.0902706  0.18613093]
 [1.54586658 7.88407961 3.25817118 5.49232203]
 [6.06380906 7.49791775 8.60978079 3.39300875]
 [7.12861246 6.6045747  1.991714   7.4517933 ]
 [7.5530726  4.08610184 6.58530841 6.97248608]
 [7.00577556 2.96282761 1.15704372 5.14490299]
 [7.55449023 4.566907   8.33096432 7.03761284]
 [2.96287954 8.76192893 3.54855811 8.69735099]
 [4.47085146 3.06452311 9.85123796 9.42718963]
 [1.31808294 0.51110064 3.9614601  1.53873374]
 [1.76447181 3.06410052 5.47611543 6.21988505]]

Using the numpy.random.RandomState class, constructed with seed None. Different random sequences across subprocesses, experiment is not reproducible:

import numpy as np
import random
from multiprocessing import Pool

def worker(seed=None):
	local_rng = np.random.RandomState(None)
	return local_rng.uniform(0, 10, 4)

pool = Pool(processes=4)
print(np.array(pool.map(worker, range(15))))

Out:
[[4.07408618 1.12597621 8.44412093 9.47768117]
 [8.31425196 5.16357479 2.57453859 7.60239709]
 [6.49322567 6.5871164  6.68840226 0.39065023]
 [4.29845063 4.91918739 6.67114227 3.67376173]
 [4.59138881 3.58527338 8.50141979 7.65785301]
 [8.06624153 3.31264488 8.42976967 6.95372882]
 [2.83546276 1.37710099 8.60106327 7.80681018]
 [1.25145727 5.18709428 2.58135664 0.90252444]
 [2.41142332 2.17311449 3.78452276 3.48049633]
 [2.95672863 4.60301463 7.03753239 6.78597938]
 [8.6664376  2.21796982 6.58946684 0.94534791]
 [8.36884194 2.04559324 4.72339016 8.56933231]
 [8.97105727 5.87126068 2.56020191 7.77482763]
 [9.3277643  3.43262419 6.99726298 9.72795273]
 [6.92330233 2.9847431  2.47027815 0.59152973]]
 
 Out:
 [[1.18008389 0.43223068 4.56589216 8.99434438]
 [2.99550761 5.63144958 1.67646676 0.92666382]
 [3.83009104 0.84242862 5.81445459 5.26410216]
 [3.90910634 2.75006498 9.72590854 2.09122993]
 [9.44506222 8.92497833 7.32124599 8.37280358]
 [8.67144413 2.78254613 4.52192337 9.98208005]
 [8.90881731 5.39467214 1.48090512 4.98031005]
 [9.56428628 5.21475564 5.92263732 3.75448466]
 [0.58264744 3.61762958 0.52948766 3.94044002]
 [1.61499444 1.89659109 4.13607624 9.00829716]
 [8.64408094 4.01001844 5.06963917 2.80810151]
 [3.88477879 1.53160203 7.67241388 1.18477263]
 [9.11289943 2.24493317 1.36399952 1.53810384]
 [0.58052752 8.02823927 3.54695805 5.06782047]
 [1.90818018 5.86118272 6.21231476 6.57730534]]
 
 Out:
 [[0.96447675 6.14939053 4.16792957 6.79878741]
 [1.95570024 7.78521459 1.15436798 6.61917479]
 [4.94226675 2.46316186 3.08293384 7.48405922]
 [3.5823681  7.29659    3.42391925 5.97202459]
 [5.62471571 0.39652861 2.07557334 9.11455541]
 [0.85831096 9.12297135 0.96402074 3.25984691]
 [0.53098281 7.75401869 7.94618052 6.54585171]
 [3.85699774 5.54857294 5.99321064 3.69811617]
 [5.90936257 5.98547035 7.62473906 4.45725101]
 [4.16769542 3.8463279  4.60334438 5.3734445 ]
 [0.20073633 2.52003127 0.46778973 8.90463061]
 [8.93925295 6.89342956 6.34603345 2.51997292]
 [8.94522954 8.31539651 8.07451252 4.20648038]
 [8.08665292 0.66729955 9.52094181 9.75303898]
 [0.43693905 0.52897897 4.55981209 6.21014578]]

Calling np.random.seed() with no argument inside a subprocess forces that process's global RNG (random number generator) instance to seed itself again from /dev/urandom or the wall clock, which will (probably) prevent you from seeing identical output from multiple subprocesses, although the results are then not reproducible. Best practice is to explicitly pass a different seed (or numpy.random.RandomState instance) to each subprocess.
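
Reseeding does not have to happen in every task: multiprocessing.Pool accepts an initializer that runs once in each worker process when it starts. The following is a minimal sketch under that assumption; init_worker, worker and run are illustrative names, and the output is still not reproducible because the seed comes from OS entropy.

```python
import numpy as np
from multiprocessing import Pool


def init_worker():
    # Runs once per worker process, right after the process starts:
    # reseed NumPy's global generator from fresh OS entropy.
    np.random.seed()


def worker(_task):
    # Uses the (now reseeded) global generator of this worker process.
    return np.random.uniform(0, 10, 4)


def run(n_tasks=15, processes=4):
    with Pool(processes=processes, initializer=init_worker) as pool:
        return np.array(pool.map(worker, range(n_tasks)))


if __name__ == "__main__":
    print(run())
```

Compared with calling np.random.seed() at the top of worker, this reseeds once per process rather than once per task.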

Using the numpy.random.RandomState class, with a different seed explicitly passed to each task. Different random sequences across subprocesses, experiment is reproducible:

import numpy as np
import random
from multiprocessing import Pool

def worker(seed=None):
	local_rng = np.random.RandomState(seed)
	return local_rng.uniform(0, 10, 4)

pool = Pool(processes=4)
print(np.array(pool.map(worker, range(15))))

Out:
[[5.48813504e+00 7.15189366e+00 6.02763376e+00 5.44883183e+00]
 [4.17022005e+00 7.20324493e+00 1.14374817e-03 3.02332573e+00]
 [4.35994902e+00 2.59262318e-01 5.49662478e+00 4.35322393e+00]
 [5.50797903e+00 7.08147823e+00 2.90904739e+00 5.10827605e+00]
 [9.67029839e+00 5.47232249e+00 9.72684360e+00 7.14815994e+00]
 [2.21993171e+00 8.70732306e+00 2.06719155e+00 9.18610908e+00]
 [8.92860151e+00 3.31979805e+00 8.21229123e+00 4.16966257e-01]
 [7.63082894e-01 7.79918792e+00 4.38409231e+00 7.23465178e+00]
 [8.73429403e+00 9.68540663e+00 8.69194540e+00 5.30855692e+00]
 [1.03741539e-01 5.01874592e+00 4.95773293e+00 1.33829529e+00]
 [7.71320643e+00 2.07519494e-01 6.33648235e+00 7.48803883e+00]
 [1.80269689e+00 1.94752415e-01 4.63218526e+00 7.24933929e+00]
 [1.54162842e+00 7.40049697e+00 2.63315015e+00 5.33739393e+00]
 [7.77702411e+00 2.37541220e+00 8.24278533e+00 9.65749198e+00]
 [5.13943344e+00 7.73165052e+00 8.70427686e+00 8.04694853e-02]]
 
 Out:
[[5.48813504e+00 7.15189366e+00 6.02763376e+00 5.44883183e+00]
 [4.17022005e+00 7.20324493e+00 1.14374817e-03 3.02332573e+00]
 [4.35994902e+00 2.59262318e-01 5.49662478e+00 4.35322393e+00]
 [5.50797903e+00 7.08147823e+00 2.90904739e+00 5.10827605e+00]
 [9.67029839e+00 5.47232249e+00 9.72684360e+00 7.14815994e+00]
 [2.21993171e+00 8.70732306e+00 2.06719155e+00 9.18610908e+00]
 [8.92860151e+00 3.31979805e+00 8.21229123e+00 4.16966257e-01]
 [7.63082894e-01 7.79918792e+00 4.38409231e+00 7.23465178e+00]
 [8.73429403e+00 9.68540663e+00 8.69194540e+00 5.30855692e+00]
 [1.03741539e-01 5.01874592e+00 4.95773293e+00 1.33829529e+00]
 [7.71320643e+00 2.07519494e-01 6.33648235e+00 7.48803883e+00]
 [1.80269689e+00 1.94752415e-01 4.63218526e+00 7.24933929e+00]
 [1.54162842e+00 7.40049697e+00 2.63315015e+00 5.33739393e+00]
 [7.77702411e+00 2.37541220e+00 8.24278533e+00 9.65749198e+00]
 [5.13943344e+00 7.73165052e+00 8.70427686e+00 8.04694853e-02]]
 
 Out:
[[5.48813504e+00 7.15189366e+00 6.02763376e+00 5.44883183e+00]
 [4.17022005e+00 7.20324493e+00 1.14374817e-03 3.02332573e+00]
 [4.35994902e+00 2.59262318e-01 5.49662478e+00 4.35322393e+00]
 [5.50797903e+00 7.08147823e+00 2.90904739e+00 5.10827605e+00]
 [9.67029839e+00 5.47232249e+00 9.72684360e+00 7.14815994e+00]
 [2.21993171e+00 8.70732306e+00 2.06719155e+00 9.18610908e+00]
 [8.92860151e+00 3.31979805e+00 8.21229123e+00 4.16966257e-01]
 [7.63082894e-01 7.79918792e+00 4.38409231e+00 7.23465178e+00]
 [8.73429403e+00 9.68540663e+00 8.69194540e+00 5.30855692e+00]
 [1.03741539e-01 5.01874592e+00 4.95773293e+00 1.33829529e+00]
 [7.71320643e+00 2.07519494e-01 6.33648235e+00 7.48803883e+00]
 [1.80269689e+00 1.94752415e-01 4.63218526e+00 7.24933929e+00]
 [1.54162842e+00 7.40049697e+00 2.63315015e+00 5.33739393e+00]
 [7.77702411e+00 2.37541220e+00 8.24278533e+00 9.65749198e+00]
 [5.13943344e+00 7.73165052e+00 8.70427686e+00 8.04694853e-02]]
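
The same idea can be expressed with the newer Generator API (np.random.default_rng, available from NumPy 1.17). This is a different technique from the RandomState experiment above, shown only as a sketch: each task builds its own generator from a list of integers, here the task index plus an experiment-wide seed. ROOT_SEED and the function names are hypothetical.

```python
import numpy as np
from multiprocessing import Pool

ROOT_SEED = 12345  # hypothetical experiment-wide seed


def worker(task_index):
    # A list of integers is a valid seed; combining the task index with the
    # root seed gives every task its own reproducible stream.
    rng = np.random.default_rng([task_index, ROOT_SEED])
    return rng.uniform(0, 10, 4)


def run(n_tasks=15, processes=4):
    with Pool(processes=processes) as pool:
        return np.array(pool.map(worker, range(n_tasks)))


if __name__ == "__main__":
    print(run())
```

Because only plain integers cross the process boundary, the result does not depend on which worker process happens to pick up which task.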

Using Python’s default random module, without seeding. Different random sequences across subprocesses, experiment is not reproducible:

import numpy as np
import random
from multiprocessing import Pool

def worker(seed=None):
	res = []
	for _ in range(4):
		res.append(random.uniform(0, 10))
	return res

pool = Pool(processes=4)
print(np.array(pool.map(worker, range(15))))

Out:
[[8.75887344 7.66899898 7.89483475 8.25558853]
 [4.82128881 5.82916386 0.34424901 7.2449423 ]
 [1.97611274 2.53168446 9.00998775 8.70417685]
 [5.14779498 5.36060047 5.8851685  4.80023303]
 [4.65710235 9.25872994 8.62409027 1.41524632]
 [3.51892985 4.07925835 7.68290768 6.05086268]
 [3.02449299 3.03966278 5.02132762 2.74329596]
 [8.33139616 9.03531729 0.99170685 3.15345865]
 [6.43301109 2.69636752 0.96816997 1.46783853]
 [4.86883469 0.14235119 7.13505221 1.10815525]
 [8.15082917 0.58924962 8.5408504  3.19921991]
 [0.26685168 1.95267462 6.83847924 7.4683847 ]
 [1.55368032 7.66428747 0.4142539  8.62424906]
 [4.6734337  2.86129616 5.9863453  4.24575983]
 [6.47002111 7.72428926 6.17812488 9.30220208]]
 
 Out:
 [[1.21602317 7.98439696 9.05620556 2.83697959]
 [9.3958507  0.4410541  4.69294741 6.21151135]
 [9.35541669 8.771601   2.67657548 1.51640967]
 [3.31345616 1.83102108 4.49195719 2.65633897]
 [4.22969662 9.78383428 4.2059855  3.51768246]
 [4.57385764 3.37973751 2.6408349  0.02887051]
 [6.21546011 0.04066268 7.80298818 9.5587169 ]
 [9.2161083  8.67319937 7.76872335 0.04412195]
 [7.08384209 1.5411624  5.91919843 5.86732347]
 [8.4855305  4.04377759 0.47967926 2.67706414]
 [7.79918004 1.37033309 7.85530482 5.61566313]
 [6.8912464  6.71061425 1.79014265 9.95667328]
 [7.50138137 6.40655105 1.6787647  3.8388262 ]
 [3.84003136 5.61118883 4.591534   6.86230083]
 [5.17758892 0.29655051 6.81150415 8.37906896]]
 
 Out:
 [[6.2008191  5.01440898 1.19431646 1.46966628]
 [3.6884441  5.98279433 8.00901935 9.64732793]
 [3.52759455 3.02354741 5.7056643  6.2760637 ]
 [9.42402217 0.84254578 6.15038093 6.27544049]
 [1.60570361 0.59387563 1.78992139 3.52261129]
 [8.8900899  6.03987706 0.33050256 5.92188544]
 [3.81516281 3.86238878 4.03855837 2.06667167]
 [9.2701765  9.60381624 7.74558517 8.19339026]
 [8.20814064 9.74237378 5.66952719 6.1851182 ]
 [1.7576639  8.29651375 3.922223   8.69603672]
 [2.06396899 6.52352009 4.98091055 5.43478808]
 [5.12493255 7.36592255 4.00586071 7.15660996]
 [6.10261842 7.69178927 7.78526434 2.3107752 ]
 [2.7852262  5.06825957 3.53834605 6.29898952]
 [6.71368228 3.46953372 4.85126099 5.78676554]]

Using Python’s default random module, seeding with no arguments. Different random sequences across subprocesses, experiment is not reproducible:

import numpy as np
import random
from multiprocessing import Pool

def worker(seed=None):
	random.seed()
	res = []
	for _ in range(4):
		res.append(random.uniform(0, 10))
	return res

pool = Pool(processes=4)
print(np.array(pool.map(worker, range(15))))

Out:
[[5.81833261 9.9168923  8.93247002 9.90518765]
 [2.90123596 3.91528954 2.77872011 4.87480949]
 [6.19832288 8.34816194 6.88661735 9.47426207]
 [3.96989689 3.22604355 2.95904976 8.34141326]
 [2.296259   5.12486065 0.21062799 8.29904048]
 [5.49116889 0.30380744 7.58570793 5.30868616]
 [1.42948943 9.73802262 4.58700448 7.56218536]
 [8.13709803 2.5842345  8.28861535 8.88060518]
 [4.26313454 2.73609069 5.15700403 3.81322537]
 [0.2053581  8.2047799  3.57169662 7.92371661]
 [0.39280806 3.80576944 6.15093436 8.24473969]
 [2.46185071 3.72478437 4.06629893 0.27102934]
 [1.09800272 3.02180595 7.84631048 7.34041065]
 [3.81548485 6.13159751 1.47117271 5.15804636]
 [5.1527888  2.89648508 9.92809524 2.52398597]]
 
 Out:
 [[0.08005381 7.29528368 8.27069162 9.68627905]
 [2.73432719 3.4192688  7.74288323 7.09917194]
 [7.34550158 1.51663462 5.19498407 7.25172718]
 [6.83526373 8.5079891  3.5465491  5.74123371]
 [9.53125345 8.76415354 7.90748868 2.89947957]
 [1.72001842 3.10226006 7.86063452 4.49702623]
 [8.27568281 0.40086092 1.5762478  0.89076698]
 [0.40298249 8.74680111 4.71490165 2.76464137]
 [7.15458122 2.66073967 3.20191642 4.33114333]
 [7.72341661 7.60516553 7.98281028 7.78942491]
 [2.54719025 1.07420672 4.22804424 9.50822762]
 [3.48335869 3.16231766 2.09045784 5.83409277]
 [1.03204353 8.14189566 0.38457616 8.36397594]
 [7.00119478 6.46496712 8.23629651 1.14136567]
 [1.60403881 7.14953015 8.5172803  7.77329254]]
 
 Out:
 [[4.16353115 4.41836225 9.34061751 5.4246647 ]
 [9.16173418 8.22467751 1.30121419 3.70650714]
 [4.30027175 3.16684564 9.93558304 4.91339036]
 [3.67886406 4.64908281 7.26663676 9.2994314 ]
 [5.24570604 7.59794435 8.58020273 1.03473814]
 [6.02959303 7.53058486 6.78805157 6.69025537]
 [0.6579908  4.67686908 8.17028565 8.70974021]
 [0.58512844 9.17258716 0.05557842 1.97269102]
 [7.04964436 5.3831703  3.89731499 0.23172793]
 [2.59543961 6.7987098  1.01134134 6.64759157]
 [5.68484053 3.6320347  7.35234878 2.86689843]
 [2.55336192 5.39675933 4.25427043 9.25019247]
 [0.13522337 0.48562083 7.89920654 0.71470221]
 [2.15937089 8.53340733 6.3133162  7.40219056]
 [3.72028792 9.99808996 6.77477225 1.43962173]]

Solution for loading random samples with a multi-process PyTorch DataLoader:

Besides switching to Python's random module, the following thread offers another fix: https://discuss.pytorch.org/t/does-getitem-of-dataloader-reset-random-seed/8097/7

Instead, add this line to the top of your main script (and you need to use Python 3):

import torch
import torch.multiprocessing as mp
mp.set_start_method('spawn') 

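Solution for loading random samples with a multi-process PyTorch DataLoader (updated July 16, 2022):

Zhihu: 可能95%的人还在犯的PyTorch错误 (A PyTorch mistake that perhaps 95% of people are still making)

The bug appears only when both of the following conditions hold:

  • The PyTorch version is < 1.9. With PyTorch < 1.9, random numbers from torch and random are fine, but numpy has the problem. With PyTorch >= 1.9, after the official fix, all of them are fine.
  • NumPy random numbers are used in the Dataset's __getitem__ method.

The DataLoader constructor has an optional parameter, worker_init_fn. Each worker process calls this function before it starts loading data, so we can set NumPy's seed inside worker_init_fn. One more point to note: by default, every worker process is killed at the end of an epoch and all of its resources are lost. When a new epoch starts, the NumPy random state in the main process has not changed and is used to initialize the workers again, so the workers' random seeds are exactly the same as in the previous epoch. We therefore need a seed that changes with the epoch number, which is hard to achieve in practice because worker_init_fn cannot know the current epoch. Fortunately, torch.initial_seed() meets this need. It is also the approach recommended in the official PyTorch documentation: https://pytorch.org/docs/stable/notes/randomness.html#dataloader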

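Why does torch.initial_seed() work?

  • Calling torch.initial_seed() in a worker process returns torch's current seed, namely base_seed + worker_id. Because the main process generates a new base_seed at the start of every epoch, base_seed is a random number that changes from epoch to epoch. In addition, torch.initial_seed() returns a long int, whereas NumPy only accepts a uint in [0, 2**32 - 1], so the value has to be taken modulo 2**32.
  • If we generate random numbers with torch or random rather than numpy, we need not worry about this problem, because PyTorch has already set the seeds of torch and random to base_seed + worker_id.

import random

import numpy as np
import torch
from torch.utils.data import DataLoader

def worker_init_fn(worker_id):
    # Inside a worker, torch.initial_seed() is base_seed + worker_id;
    # reduce it to the range accepted by np.random.seed.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)

# train_dataset, batch_size and num_workers are defined elsewhere.
DataLoader(
    train_dataset,
    batch_size=batch_size,
    num_workers=num_workers,
    worker_init_fn=worker_init_fn
)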
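
The seed arithmetic can be checked without PyTorch. The sketch below assumes the scheme described in the text (the worker's torch seed equals base_seed + worker_id); numpy_seed_for_worker and the two base seeds are hypothetical values chosen so that the result is easy to verify by hand.

```python
def numpy_seed_for_worker(base_seed, worker_id):
    # Assumed scheme from the text: torch's seed inside a worker is
    # base_seed + worker_id, reduced modulo 2**32 for np.random.seed.
    torch_seed = base_seed + worker_id
    return torch_seed % 2**32


# Hypothetical base seeds, one per epoch (both larger than 2**32).
epoch_base_seeds = [2**40 + 7, 2**41 + 1000]

for epoch, base_seed in enumerate(epoch_base_seeds):
    seeds = [numpy_seed_for_worker(base_seed, w) for w in range(4)]
    print(f"epoch {epoch}: {seeds}")
# epoch 0: [7, 8, 9, 10]
# epoch 1: [1000, 1001, 1002, 1003]
```

Workers within an epoch get distinct NumPy seeds, and a new base seed shifts all of them for the next epoch.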

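Selected comments on the Zhihu article 可能95%的人还在犯的PyTorch错误:

  • One person said: "Using random and numpy can also lead to deadlocks." Another user replied: "Whoa, I haven't tested it yet, but this sentence solves a problem that has bothered me for two years! My deadlock is probably caused by numpy random; setting the number of workers to 0 makes it go away. I always thought it was a PyTorch bug."
  • One person said: "I ran into this before and switched to np.random.default_rng in the worker processes to get random numbers."
  • One person said: "But it seems that only the random numbers of the augmentation part are repeated, and in essence all possible augmentations are still enumerated, so even if performance is affected, the effect should be fairly limited?" I have had a similar thought, but I am not sure.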
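
One of the comments mentions switching to np.random.default_rng inside the worker processes. The comment gives no code, so the following is only one possible reading of it: a hypothetical map-style dataset (RandomCropDataset is not a PyTorch class) that creates its own generator lazily, once per process, instead of using NumPy's global state. Seeding from OS entropy avoids duplicated streams but is not reproducible.

```python
import os

import numpy as np


class RandomCropDataset:
    """Hypothetical dataset that returns a random crop of each row."""

    def __init__(self, data, crop=3):
        self.data = np.asarray(data)
        self.crop = crop
        self._rng = None
        self._rng_pid = None

    def _generator(self):
        # Build the generator on first use in each process, so a forked
        # worker never reuses generator state copied from the parent.
        pid = os.getpid()
        if self._rng is None or self._rng_pid != pid:
            self._rng = np.random.default_rng()  # seeded from OS entropy
            self._rng_pid = pid
        return self._rng

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        row = self.data[index]
        # integers() excludes the upper bound, so every crop fits in the row.
        start = self._generator().integers(0, len(row) - self.crop + 1)
        return row[start:start + self.crop]


if __name__ == "__main__":
    dataset = RandomCropDataset(np.arange(20).reshape(2, 10))
    print(dataset[0], dataset[1])
```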