Metadata-Version: 2.1
Name: py-dsm
Version: 1.0.1
Summary: Python true-shared memory parallel computation
Home-page: https://github.com/matloff/pydsm
Author: Daniel Guo, Norm Matloff
Author-email: zhyguo@ucdavis.edu, matloff@cs.ucdavis.edu
License: UNKNOWN
Description: # Pydsm
        
        While multi-threaded parallel performance in Python is limited by
        the Global Interpreter Lock, Pydsm provides parallel computation that
        truly shares memory between processes in a distributed fashion.
        
        
        ## Requirements
        
        * SharedArray
        * numpy
        * POSIX (Linux, variants of Unix including macOS, etc.)
        * Python 2 is supported, but Python 3 is recommended
        
        You can install py-dsm via the `pip` command:
        
        ```
        pip install py-dsm
        ```
        
        ## Authors
        
        Zhiyuan Guo
        
        Norm Matloff
        
        
        ## Usage
        
        
        ### Start the processes
        
        To create a cluster instance with 4 worker processes:
        
        ```
        import pydsm
        
        with pydsm.Cluster(4) as cluster:
            pass  # set up shared arrays and run worker processes here
        ```
        
        ### Global array
        
        In pydsm, all global arrays in shared memory are numpy arrays. 
        To create a global array, one needs to first create a cluster instance 
        and then call `createShared()`.
        
        ```
        a = cluster.createShared(name = "A", shape = 10, dataType = int)
        ```
        
        The user provides an arbitrary but unique name for the global array,
        the shape (i.e., dimensions) of the array, and the datatype (default
        is int). The shape and datatype arguments follow the same format as
        typical numpy functions such as `numpy.zeros()`. The returned array
        has all elements initialized to zero. Names of global arrays must be
        unique.
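        
        For example, a two-dimensional array of floats can be created the
        same way (a brief sketch; per the above, `shape` and `dataType`
        follow the conventions of `numpy.zeros()`):
        
        ```
        M = cluster.createShared(name = "M", shape = (4, 4), dataType = float)
        # M behaves like a 4x4 numpy array, initialized to zeros
        ```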
        
        ### Run the processes
        Then, `runProcesses(func, paras)` is invoked
        to run the user's function in parallel. `func` is the
        user-defined parallel function. `paras`, which defaults to an empty
        tuple, is a tuple of parameters passed to the user's function.
        Each worker process gets its own copy of the parameters in `paras`.
        
        
        ```
        cluster.runProcesses(foo)
        ```
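        
        To pass extra arguments to `foo`, supply them through `paras`
        (a brief sketch; the value 100 is purely illustrative):
        
        ```
        cluster.runProcesses(foo, paras=(100,))
        # each worker receives its own copy of 100 as an extra argument
        ```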
        
        
        ### Parallelized function
        
        The function `foo` implemented by the user will be executed
        by all 4 processes simultaneously.
        **The first argument of a user's parallel function is mandatory.**
        This mandatory parameter is a dictionary that contains the **resources**
        you may need for your parallel function.
        The resources include references to the id of each process,
        the global array(s), and a lock.
        
        
        ```
        def foo(res, *args):
        	# res is the resource dictionary (the mandatory first argument)
        	# *args holds any extra parameters passed in via `paras`
        	pass  # your parallel code goes here
        ```
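        
        Inside the function, the resource dictionary provides the worker's
        id and the cluster lock, and shared arrays can be attached with
        `getShared()` (a minimal sketch based on the keys used in this
        document):
        
        ```
        def foo(res):
            myid = res['id']    # integer id of this worker process
            lock = res['lock']  # cluster-wide lock (see below)
            A = pydsm.Cluster.getShared("A")  # attach the global array "A"
        ```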
        
        
        
        ### Locks and barriers
        
        To use a lock in the parallelized function,
        the lock first needs to be retrieved from
        the resource dictionary (the first parameter).
        
        ```
        lock = res['lock']
        ```
        
        Then, the user can apply the lock anywhere
        they want in the function.
        
        ```
        lock.apply()
        # Critical section
        lock.release()
        ```
        
        The barrier is invoked in the following way.
        
        ```
        pydsm.Cluster.barrier()
        ```
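        
        As a brief sketch combining the two (the function and the array name
        "total" are illustrative, assuming a one-element shared array was
        created with `createShared()` beforehand): each worker adds its id to
        a shared counter under the lock, and the barrier ensures all updates
        complete before any worker proceeds.
        
        ```
        def tally(res):
            lock = res['lock']
            total = pydsm.Cluster.getShared("total")  # shared 1-element array
            lock.apply()
            total[0] += res['id']    # critical section: serialized update
            lock.release()
            pydsm.Cluster.barrier()  # wait until every worker has added its id
        ```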
        
        ### Split the tasks
        
        The user can invoke `splitIndices(n, id, random=True)` inside the
        parallel function to distribute the tasks.
        `splitIndices` returns to each worker process its own list of indices.
        
        Say we want to compute Y = AX in parallel by splitting the rows of A into
        chunks.  We have 100 rows in A and 10 processors. We can let each
        process take 10 rows. 
        
        ```
        myid = resource['id']
        myidxs = pydsm.Cluster.splitIndices(100, myid, random=True)
        Y[myidxs] = np.matmul(A[myidxs], X)
        ```
        
        In this scenario, `n` will be 100.
        It is recommended to set `random` to True to
        split the rows randomly and thus achieve better load balancing.
        Otherwise, process 0 will take the first ten rows, process 1 the
        second ten rows, and so on, which may result in poor load balancing.
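        
        Putting the pieces together, a sketch of the full Y = AX computation
        might look like this (the driver code and array initialization here
        are illustrative, following the API described above):
        
        ```
        import numpy as np
        import pydsm
        
        def matvec(resource, nrows):
            myid = resource['id']
            A = pydsm.Cluster.getShared("A")
            X = pydsm.Cluster.getShared("X")
            Y = pydsm.Cluster.getShared("Y")
            myidxs = pydsm.Cluster.splitIndices(nrows, myid, random=True)
            Y[myidxs] = np.matmul(A[myidxs], X)
        
        with pydsm.Cluster(10) as cluster:
            A = cluster.createShared("A", (100, 100), float)
            X = cluster.createShared("X", 100, float)
            Y = cluster.createShared("Y", 100, float)
            A[:] = np.random.rand(100, 100)
            X[:] = np.random.rand(100)
            cluster.runProcesses(matvec, paras=(100,))
        ```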
        
        
        
        ## Example
        This is an embarrassingly parallel example that illustrates the usage
        of pydsm. Two vectors are added in parallel.
        The idea is to break A and B into chunks, and have each worker process
        work on one chunk.
        The complete source code is in `doc/examples/vecAdd.py`.
        
        A and B are the two vectors to be added.
        The result is stored in C as the computation is carried out.
        Thus, all three need to be shared variables.
        They are set up in the following way.
        
        
        ```
        import numpy as np
        import pydsm
        
        with pydsm.Cluster(4) as cluster:
        	A = cluster.createShared(name = "A", shape = 10, dataType = int)
        	B = cluster.createShared("B", 10, int)
        	C = cluster.createShared("C", 10, int)
        ```
        
        Now A and B are shared arrays backed by numpy.
        We can treat them just as we would ordinary numpy arrays.
        
        ```
        	A[:] = np.arange(10) 
        	# A is now "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
        	B[:] = np.arange(10)
        	B += 1 
        	# B is now "array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"
        ```
        
        We then call `runProcesses()`, with the first parameter being
        the name of the parallel function, which we will define later.
        The second parameter is a tuple of parameters used by the parallel
        function. Here we pass in 10, the length of each vector.
        
        ```
        	cluster.runProcesses(add, paras=(10,))
        ```
        
        Next, we define the parallel function `add`.
        
        ```
        def add(resource, n):
            myid = resource['id']
            A = pydsm.Cluster.getShared("A")
            B = pydsm.Cluster.getShared("B")
            C = pydsm.Cluster.getShared("C")
        
            myidxs = pydsm.Cluster.splitIndices(n, myid, random=False)
            C[myidxs] = A[myidxs] + B[myidxs]
        ```
        
        Each worker process first obtains a unique id, and then
        attaches A, B, and C to the corresponding shared variables.
        `splitIndices()` is then called to give each worker a list of indices.
        Each worker then computes the elements of the array corresponding to
        those indices.
        Note that C is a shared variable, so `add` does not need to return it.
        
        To illustrate the usage of barriers,
        we can add the following code at the end of `add`.
        Process 1 will print out C after the computation is done.
        
        ```
        # Below is just for illustration purposes
        pydsm.Cluster.barrier()
        if myid == 1:
        	# In some versions of Python, printing C directly may cause issues.
        	# It is better to first convert the SharedArray into a numpy array
        	# and then print it, i.e., do np.array(C) before printing.
        	print("Check out vector C in processes: ", np.array(C))
        	# C will be '[ 1  3  5  7  9 11 13 15 17 19]'
        ```
        
        
        ## More examples
        
        More sample applications using pydsm, such as finding prime numbers
        and matrix multiplication, can be found under the directory `inst/`.
        
        ## Notes
        
        If your program terminates abnormally, py-dsm may not completely delete
        your shared variables. The next time you run your program, you may
        encounter the following error message, because the program is trying to
        create a shared variable that is still alive.
        
        ```
        # You are trying to create a shared variable named 'A', but there is
        # already a shared variable named 'A'.
        FileExistsError: [Errno 17] File exists: 'shm://A'
        ```
        
        In this scenario, you need to delete the shared variable yourself in the Python
        interpreter.
        
        ```
        >>> import SharedArray as sa
        >>> sa.delete("A")
        ```
        
        
        
Keywords: shared memory parallel concurrent multiprocessing numpy
Platform: UNKNOWN
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Scientific/Engineering
Classifier: Operating System :: Unix
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Description-Content-Type: text/markdown
