I am trying to move all memory accesses to the remote NUMA node (i.e., node1), using numatop or other tools if possible. However, it seems to me that NumPy, which uses OpenBLAS internally, does not place its memory on node1.
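For reference, here is a minimal sketch of how the placement could be checked from inside the process, assuming Linux and the per-node `N<node>=<pages>` fields in `/proc/<pid>/numa_maps`. The helper names `pages_per_node` and `pages_per_node_for_pid` are made up for illustration and are not part of NumPy or OpenBLAS.

```python
import re
from collections import Counter

# Matches the per-node page counts in a numa_maps line, e.g. "N0=4 N1=6".
_NODE_PAGES = re.compile(r"\bN(\d+)=(\d+)\b")


def pages_per_node(numa_maps_text):
    """Sum the page counts per NUMA node over all mappings in the text."""
    totals = Counter()
    for line in numa_maps_text.splitlines():
        for node, pages in _NODE_PAGES.findall(line):
            totals[int(node)] += int(pages)
    return dict(totals)


def pages_per_node_for_pid(pid="self"):
    """Read /proc/<pid>/numa_maps (Linux only) and summarize it."""
    with open(f"/proc/{pid}/numa_maps") as f:
        return pages_per_node(f.read())
```

Calling `pages_per_node_for_pid()` right after the multiplication should show whether most pages are counted under node 1. The counts are in pages of each mapping's own page size, so they are only a rough indicator when huge pages are involved.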
This is the workload I am testing, just a simple matrix multiplication. Running...