Well, there are a few parts to this question. In general, a good rule of thumb is to run no more threads than you have logical processors, though this is usually for the whole system and may depend on load. To find out how many logical and physical cores you have, you can use cat /proc/cpuinfo. It'll print a set of lines for each logical core, so scroll down and look at the last one (I have 8 almost identical ones on my quad core, HT system):
    processor       : 7
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 58
    model name      : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
    stepping        : 9
    microcode       : 0x16
    cpu MHz         : 3401.000
    cache size      : 8192 KB
    physical id     : 0
    siblings        : 8
    core id         : 3
    cpu cores       : 4
    apicid          : 7
    initial apicid  : 7
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 13
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
    bogomips        : 6819.66
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 36 bits physical, 48 bits virtual
    power management:
I'll pick out the important lines here:
physical id : 0 (this is the first socket. If any entry shows a number greater than 0, you have multiple sockets; in that case check the processor and cpu cores values for each physical id)
processor : 7 (this number runs from 0 to n-1, so this is the 8th logical core. On a multi-socket system, count the entries sharing a physical id to get the logical cores in that socket; the siblings line reports the same number)
cpu cores : 4 (I have 4 physical cores. This will be the same for every entry, and since SMP generally uses identical processors, it should be the same for each socket on a dual socket system)
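If you'd rather not read the output by eye, here is a minimal Python sketch of the same counting. The function name summarize_cpuinfo is made up for illustration, and it assumes the x86-style layout shown above: one blank-line-separated block per logical core, each with processor, physical id and cpu cores lines. Some architectures omit physical id or cpu cores, in which case the fallbacks in the code are only a guess.

```python
def summarize_cpuinfo(text):
    """Count sockets, physical cores and logical processors.

    `text` is the content of /proc/cpuinfo. Assumes one block per
    logical core, separated by blank lines (x86-style layout).
    """
    logical = 0
    cores_per_socket = {}  # physical id -> value of "cpu cores"

    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()

        if "processor" not in fields:
            continue  # not a per-core block

        logical += 1
        # Assumption: a missing "physical id" means a single socket,
        # and a missing "cpu cores" is treated as one core.
        socket = fields.get("physical id", "0")
        cores_per_socket[socket] = int(fields.get("cpu cores", "1"))

    return {
        "sockets": len(cores_per_socket),
        "physical_cores": sum(cores_per_socket.values()),
        "logical_processors": logical,
    }
```

On a Linux box you would call it as summarize_cpuinfo(open("/proc/cpuinfo").read()). For the machine above I would expect it to report 1 socket, 4 physical cores and 8 logical processors.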
My processor should allow me to run 8 threads simultaneously, assuming one thread per logical core. That said, depending on the run time and other factors, you may be able to get away with more.
SO has quite a few questions on this. Picking two of those: the answers to this question suggest that one thread per logical core is a good idea, though this one suggests you may be able to go higher. As such, unfortunately, the answer is to start with one thread per logical processor and tune upward from there. The best number may turn out to be insanely high if the threads aren't long-running or memory-hungry.
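As a rough sketch of that starting point in Python: os.cpu_count() returns the number of logical processors (or None if it can't tell), so it makes a reasonable default pool size. The helper name run_with_default_threads and its threads parameter are hypothetical, just a knob to tune upward from. Note that in CPython, threads mostly help I/O-bound work, so treat this as an illustration of the sizing rule rather than a benchmark setup.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_with_default_threads(task, items, threads=None):
    """Run task(item) for each item on a thread pool.

    Defaults to one thread per logical processor; pass a larger
    `threads` value to experiment with oversubscription.
    """
    if threads is None:
        # os.cpu_count() may return None, so fall back to 1.
        threads = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves the order of `items` in the results.
        return list(pool.map(task, items))
```

Start with the default, measure, then try threads=16, threads=32 and so on until throughput stops improving.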