I have the following questions:
- Does OpenMP make pthread calls?
- How are threads created in OpenMP?
- Is OpenMP a replacement for pthreads, or are OpenMP and pthreads entirely different?
- If OpenMP and pthreads are different, which gives better parallelism at the C level? For example, in the OpenBLAS math library, is OpenBLAS + OpenMP better or is OpenBLAS + pthreads better?
OpenMP is a cross-platform standard, and the standard can be implemented in any way the implementor wants. Obviously, on a platform without a native POSIX threads library, such as Windows, OpenMP will not be implemented via pthreads. Where pthreads is available, the OpenMP library may use it or go straight for the platform-specific low-level interface.
However, the OpenMP implementations provided by GCC and Clang do indeed call pthreads, as far as I know. At the very least they are compatible so that you can mix-and-match the libraries, e.g. use pthread's thread-local variables in conjunction with OpenMP's.
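To illustrate the mix-and-match point, here is a minimal sketch that stores a pthread thread-specific value from inside an OpenMP parallel region and reads it back. The function name `count_tls_mismatches` is made up for this example, and the behaviour shown is what I would expect from GCC or Clang on a POSIX system, not something the OpenMP standard guarantees. Compile with `-fopenmp`; without it the pragmas are ignored and the code simply runs on one thread.

```c
#include <pthread.h>

/* Hypothetical helper: returns how many threads read back a value that is
 * not their own (0 is expected), or -1 if the key could not be created. */
int count_tls_mismatches(void)
{
    pthread_key_t key;
    int mismatches = 0;

    if (pthread_key_create(&key, NULL) != 0)
        return -1;

    #pragma omp parallel reduction(+:mismatches)
    {
        int marker = 0;  /* private: lives on this thread's own stack */

        /* Store a per-thread pointer through the pthreads API. */
        pthread_setspecific(key, &marker);

        /* Wait until every thread in the team has stored its value. */
        #pragma omp barrier

        /* Each thread should still see its own pointer, not a neighbour's. */
        if (pthread_getspecific(key) != &marker)
            mismatches++;
    }

    pthread_key_delete(key);
    return mismatches;
}
```

Calling `count_tls_mismatches()` from `main` should return 0 if the OpenMP worker threads behave like ordinary pthreads as far as thread-specific data is concerned.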
Again, how threads are created is specific to the implementation. Normally you don't need to worry about it.
The OpenMP interface caters to very specific styles of parallelization, like classic fork-join parallelization of loops. Pthreads is more general-purpose but requires you to do a lot of the things manually that OpenMP provides, such as distributing work across threads.
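To make the difference concrete, here is a rough sketch of the same array sum written both ways. The names `sum_pthreads`, `sum_openmp`, `struct chunk` and the fixed `NUM_THREADS` of 4 are assumptions for illustration only. The pthreads version has to split the index range, start and join the threads, and combine the partial results by hand; the OpenMP version leaves all of that to the runtime. Compile with `-fopenmp` (and `-pthread`); without `-fopenmp` the pragma is ignored and `sum_openmp` runs serially.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4  /* assumption: fixed thread count for the sketch */

/* Work description handed to each pthread. */
struct chunk {
    const double *data;
    size_t begin, end;   /* half-open index range [begin, end) */
    double partial;      /* result written by the worker */
};

static void *sum_chunk(void *arg)
{
    struct chunk *c = arg;
    double s = 0.0;
    for (size_t i = c->begin; i < c->end; i++)
        s += c->data[i];
    c->partial = s;
    return NULL;
}

/* pthreads: manual work distribution, thread creation and reduction. */
double sum_pthreads(const double *data, size_t n)
{
    pthread_t threads[NUM_THREADS];
    struct chunk chunks[NUM_THREADS];
    int started[NUM_THREADS];
    double total = 0.0;

    for (size_t t = 0; t < NUM_THREADS; t++) {
        chunks[t].data = data;
        chunks[t].begin = n * t / NUM_THREADS;
        chunks[t].end = n * (t + 1) / NUM_THREADS;
        chunks[t].partial = 0.0;
        started[t] = pthread_create(&threads[t], NULL, sum_chunk, &chunks[t]) == 0;
        if (!started[t])
            sum_chunk(&chunks[t]);  /* fall back to the calling thread */
    }
    for (size_t t = 0; t < NUM_THREADS; t++) {
        if (started[t])
            pthread_join(threads[t], NULL);
        total += chunks[t].partial;
    }
    return total;
}

/* OpenMP: the runtime splits the loop and combines the partial sums. */
double sum_openmp(const double *data, size_t n)
{
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}
```

Both functions take a pointer and a length, so they can be called interchangeably, e.g. `sum_openmp(values, 1000)`. Floating-point results may differ slightly between the two because the order of additions is not the same.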
When the programming model of OpenMP fits your use case, it saves you work and brings with it low-level performance tuning that fits this style of parallelization. For example, OpenMP runtimes typically keep a thread pool, handle CPU binding, and provide synchronization primitives that are tuned or tunable for this style of parallelization (e.g. spinning for longer before going to sleep).
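One way to get a feel for the thread pool is to run two parallel regions back to back and compare the pthread IDs of the threads in each. The sketch below does that; `collect_ids`, `count_reused_threads` and `MAX_IDS` are names invented for this example. Treat the result as a heuristic only: whether threads are reused is up to the runtime, and a thread ID can in principle be recycled for a new thread. Compile with `-fopenmp`; without it the code runs on a single thread.

```c
#include <pthread.h>

#define MAX_IDS 1024  /* assumption: upper bound on the team size we record */

/* Run one parallel region and record the pthread ID of each team member. */
static int collect_ids(pthread_t ids[MAX_IDS])
{
    int count = 0;

    #pragma omp parallel
    {
        pthread_t self = pthread_self();

        /* Serialize updates to the shared array and counter. */
        #pragma omp critical
        {
            if (count < MAX_IDS)
                ids[count++] = self;
        }
    }
    return count;
}

/* Return how many threads of a second region were also seen in the first. */
int count_reused_threads(void)
{
    pthread_t first[MAX_IDS], second[MAX_IDS];
    int n1 = collect_ids(first);
    int n2 = collect_ids(second);
    int reused = 0;

    for (int i = 0; i < n2; i++) {
        for (int j = 0; j < n1; j++) {
            if (pthread_equal(second[i], first[j])) {
                reused++;
                break;
            }
        }
    }
    return reused;
}
```

If the runtime pools its workers, I would expect `count_reused_threads()` to be close to the team size. The calling thread takes part in both regions, so the result is at least 1. Binding and waiting behaviour can be adjusted from outside the program with the standard `OMP_PROC_BIND` and `OMP_WAIT_POLICY` environment variables; GCC's libgomp additionally has `GOMP_SPINCOUNT`.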
As far as OpenBLAS or FFTW are concerned, the main benefit I see is that the OpenMP versions can share a single thread pool instead of each library maintaining its own. This reduces the number of context switches.