[python] Dask Parallel Computing Example

Here is an example of using Dask's Client to start a local cluster and run a computation in parallel:

from dask.distributed import Client

# Define a computation as a normal Python function
def my_computation(x):
    return x * x

# The guard is needed because the local workers are separate processes by default
if __name__ == "__main__":
    # Start a local cluster with 2 workers
    client = Client(n_workers=2)

    # Use the client to submit one task per input; map returns a list of futures
    futures = client.map(my_computation, range(10))

    # Wait for the computations to complete and retrieve the results
    results = client.gather(futures)

    print(results)

    # Shut down the client and the local cluster it started
    client.close()

In this example, the Client class starts a local Dask cluster with 2 workers (separate processes by default, not threads), and my_computation is a simple computation written as a normal Python function. The map method of the Client instance submits one task per input to the cluster and returns a list of futures. Finally, the gather method waits for the computations to complete and returns the results as a list, in the same order as the inputs.
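To make the map and gather steps more concrete, here is a minimal sketch that looks at a single future before gathering everything. The helper name run_squares is made up for illustration, and processes=False and dashboard_address=None are choices for this demo only: they keep the workers as threads inside the current process and turn off the dashboard, so the snippet runs as a plain script without a main guard.

```python
from dask.distributed import Client


def my_computation(x):
    return x * x


def run_squares(n):
    # Hypothetical helper for this sketch, not part of Dask.
    # processes=False keeps the workers in this process (demo assumption);
    # dashboard_address=None disables the diagnostics dashboard.
    with Client(n_workers=2, processes=False, dashboard_address=None) as client:
        # map returns immediately with one future per input
        futures = client.map(my_computation, range(n))

        # result() blocks until that single task has finished
        first = futures[0].result()

        # gather blocks until all tasks finish and keeps the input order
        squares = client.gather(futures)
    return first, squares


first, squares = run_squares(10)
print(first)
print(squares)
```

Using the client as a context manager closes it, along with the local cluster it started, when the block ends.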

You can also use Client to connect to a distributed cluster running on other machines or in the cloud. Just pass the address of the scheduler to the Client constructor:

client = Client("tcp://<SCHEDULER_IP>:8786")

In this case, you need to make sure that the cluster is already running and the scheduler is reachable.
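If no remote cluster is at hand, the same connection pattern can be tried locally. The sketch below is only a stand-in under stated assumptions: a LocalCluster plays the role of the already running cluster, protocol="tcp://" is set so that its scheduler gets a TCP address, and processes=False keeps everything in one process for the demo. With a real deployment you would skip the LocalCluster part and pass your own scheduler address.

```python
from dask.distributed import Client, LocalCluster


def my_computation(x):
    return x * x


# Stand-in for a cluster that is "already running" (demo assumption):
# an in-process cluster whose scheduler listens on TCP.
cluster = LocalCluster(
    n_workers=1,
    processes=False,
    protocol="tcp://",
    dashboard_address=None,
)

# The scheduler address has the same form as the one used above
address = cluster.scheduler_address

# Connect by address, exactly as you would for a remote scheduler
client = Client(address)

# submit sends a single task and returns a future
future = client.submit(my_computation, 7)
result = future.result()
print(address, result)

client.close()
cluster.close()
```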

Note that before running this code, you need to make sure that Dask distributed is installed. You can install it by running pip install "dask[distributed]" (the quotes keep shells such as zsh from interpreting the square brackets).
