For a modern data engineer, knowledge of concurrency models is important.
A data engineer should know the difference between concurrency and parallelism.
A data engineer should know the difference between task parallelism and data parallelism.
Threads vs. processes. Example in Python: the threading and multiprocessing libraries, how they differ, and what problems Python has with threading (most notably the global interpreter lock in CPython).
A pretty typical scenario for modern data integration: call n APIs every x seconds / minutes / hours. How do you do that with good performance? One option is asynchronous programming.
The actor model might be good to know as well.
DAGs (example: Apache Airflow) vs. state machines (example: AWS Step Functions) vs. ... . This is actually covered by 'Data structures and algorithms', but it might be worth mentioning as an example of how that knowledge can help a data engineer.
Parallel programming on GPUs using techniques like CUDA.
Functional programming is also 'nice to have' (but not obligatory).
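To make the threads vs. processes point concrete, here is a minimal sketch, assuming standard CPython with the GIL enabled. The function names `cpu_bound` and `run_with` and the workload size are made up for illustration. It runs the same CPU-bound job through a thread pool and a process pool. On a GIL build the thread version usually gives little or no speedup, while the process version can use several cores; actual timings depend on the machine, so none are claimed here.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound(n):
    """Pure-Python CPU-bound work: sum of squares below n."""
    return sum(i * i for i in range(n))


def run_with(executor_cls, inputs, workers=4):
    """Map cpu_bound over inputs using the given executor class."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(cpu_bound, inputs))


if __name__ == "__main__":
    inputs = [2_000_000] * 4  # arbitrary workload size (assumption)
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        start = time.perf_counter()
        run_with(cls, inputs)
        print(f"{cls.__name__}: {time.perf_counter() - start:.2f}s")
```

For I/O-bound work (network calls, disk reads) the picture is different: threads release the GIL while waiting, so a thread pool is often good enough there.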
If you agree on at least some of the points, I can prepare the text.
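As a rough illustration of the "call n APIs every x seconds" scenario, here is a minimal asyncio sketch using only the standard library. The `fetch` coroutine is a hypothetical stand-in that sleeps instead of making a real HTTP request, and the API names and intervals are invented for the example. In practice you would swap in an async HTTP client of your choice.

```python
import asyncio


async def fetch(name, delay):
    """Stand-in for an HTTP call (assumption): sleeps instead of doing network I/O."""
    await asyncio.sleep(delay)
    return f"{name}: ok"


async def poll(name, interval, rounds, results):
    """Call one API `rounds` times, waiting `interval` seconds after each call."""
    for _ in range(rounds):
        results.append(await fetch(name, delay=0.01))
        await asyncio.sleep(interval)


async def poll_all(schedule, rounds=2):
    """Run one poller per API concurrently on a single thread."""
    results = []
    pollers = (poll(name, interval, rounds, results)
               for name, interval in schedule.items())
    await asyncio.gather(*pollers)
    return results


if __name__ == "__main__":
    # Hypothetical API names mapped to polling intervals in seconds.
    print(asyncio.run(poll_all({"orders": 0.05, "users": 0.1})))
```

Each poller spends most of its time waiting, so a single event loop can interleave many of them without needing a thread or process per API.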
Hey, these are really good points! I'll def consider adding these to the image when I update it next time. Feel free to create a PR and add it to the markdown version. Thanks a lot for the contribution!