Cloud | Languages, Tools & Frameworks | Technology

Reproducible work environments using Docker

Jonathan Heng

Published: Sep 23, 2019

This article covers the basics of using Docker to control dependencies ranging from operating system to packages. While we use Python as an example here, the concepts are equally applicable to any other programming language.

A common pitfall for Python users is dependency management. People are often unsure how to set up virtual environments or how to reproduce an environment. Often, I see people simply running `pip install` for any library they need, installing everything into the local environment as a global dependency.

These are common challenges that anyone would face in a real-world project:
  • Needing different versions of the same library for different projects
  • Losing track of the required libraries for a specific project
  • Requiring a different Python version
  • Spending hours setting up a project on a new team member’s machine (which could run a different OS as well)
  • Being unable to automate deployment because the entire setup process is convoluted and manual

I’m sure these are common issues that other developers face as well, which explains the existence of the numerous tools that solve the same problems. Tools for managing Python versions and environments include `pyenv`, `venv`, `virtualenv` and `pyvenv` (deprecated in favor of `venv` since Python 3.6). Unfortunately, many of these tools have very similar names, and some have identical purposes. Understanding them can be a steep learning curve for someone who is new to the project, to Python, or both.
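
For context, this is roughly what a manual, venv-based workflow looks like in a Unix-like shell; the directory name .venv and the use of a requirements.txt file are common conventions rather than requirements:
python3 -m venv .venv               # create an isolated environment in .venv
source .venv/bin/activate           # activate it for the current shell session
pip install -r requirements.txt     # install the project's dependencies into .venv
Every developer has to remember to repeat these steps for every project, with a suitable Python version already installed, which is exactly the kind of manual bookkeeping we would like to avoid.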
 


Next, we also have command line interface (CLI) tool dependencies. Let’s take one of the most common CLI tools used in software development, git, as an example. In our scenario, we want to replicate our working environment on a machine that doesn’t have git. One approach is to write a script that installs the git CLI tool. On Mac OS X, we can add the command brew install git to our script. All our team members then have access to git as long as they run the setup script.

In more complicated projects, we have team members using different operating systems, such as Linux and Windows. Our old setup script for Mac OS X no longer works. We could add a conditional check for the type of OS and run the corresponding install command, as sketched below, before realizing in horror that there’s no straightforward way to install git on Windows from a script on the default command prompt. (It’s still possible to automate this on Windows, but it requires other dependencies, and I leave it up to you to discover the details.)
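
As a rough sketch, such a branching script might look like the following, assuming Homebrew on Mac OS X and apt-get on Debian-based Linux (Windows is deliberately left out, for the reason above):
#!/usr/bin/env bash
# Hypothetical setup.sh that branches on the host OS
case "$(uname -s)" in
  Darwin) brew install git ;;
  Linux)  sudo apt-get update && sudo apt-get install -y git ;;
  *)      echo "Unsupported OS: please install git manually" && exit 1 ;;
esac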

Keeping track of all these tools and branching scripts that we use to ensure similar working environments can become a pain too. Each time we have a new CLI tool, we have to write different installation scripts for each OS. On top of that, there are also OS limitations that are difficult to overcome.

Wouldn’t it be great if there was one single tool that could set up our entire working environment?

Docker to the rescue!

Docker allows us to manage the following dependencies in a single place:
  • OS dependencies
  • CLI tools dependencies
  • Python dependencies
By properly defining a Dockerfile in our project repository, we can create an environment, separate from our local environment, that contains all the dependencies we need. This eliminates all possible conflicts with user preferences and local machine setup, as long as everyone works inside the Docker container. Let us now look in detail at how we can set up a Dockerfile that solves the above issues.

Before we begin, a few words about the Dockerfile:
  1. It is a step-by-step set of instructions that tells Docker how to build a Docker image.
  2. There is a standard set of instructions it can use, such as FROM, RUN, COPY and WORKDIR.
  3. After preparing the Dockerfile, the image is built by running docker build <path to Dockerfile>, or docker build . from the directory containing the Dockerfile.

Defining OS dependencies

The first thing we do is specify our base image. Let us choose from the list of suitable Docker images for Python; we shall use the 3.6-slim version for our example. The first line of our Dockerfile will look like this:
FROM python:3.6-slim
How do we find out exactly which OS this image uses? We can find the Dockerfile for the 3.6-slim version among the official Python Docker images; it specifies how this base image was built. The first line of that Dockerfile states:
FROM debian:stretch-slim
This tells us that the OS is a Debian Linux distribution.
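
If you want to confirm this for yourself, a quick check is to print the OS release information from inside the base image:
docker run --rm python:3.6-slim cat /etc/os-release
The output should identify the image as Debian 9 ("stretch").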

Defining Python dependencies

Since we are creating an entirely separate environment using a Docker image, we don’t need to worry about managing Python versions or virtual environments. What we need to do is define our Python dependencies in a text file. In the following example, we’ll name our text file requirements.txt. We can manually define the packages we need or extract the existing dependencies from a working environment using the following command:
pip freeze > requirements.txt
After we have our requirements.txt ready, we can add the following commands to our Dockerfile:
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
The first line copies the requirements.txt file from our local working directory into the Docker image. The second line runs the command pip install -r requirements.txt, which installs all the libraries you need into your Docker image.
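
For illustration, a requirements.txt file is just a plain list of packages, usually with pinned versions; the packages and version numbers below are placeholders for whatever your project actually needs:
# requirements.txt (illustrative contents)
numpy==1.17.2
pandas==0.25.1
requests==2.22.0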

Defining CLI tools dependencies

We can run commands during the building of our Docker image to install CLI tools. Since we are using a Debian Linux distribution, we can use apt-get to install curl and git. The snippet below shows what we need to add to our Dockerfile to do so.
RUN apt-get update \
    && apt-get install -y curl git
Any other CLI tools you need can be installed in a similar fashion. Just add the commands that you need to run into the Dockerfile.
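
If image size is a concern, a common optional variation is to skip recommended packages and clean up the apt cache in the same layer:
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl git \
    && rm -rf /var/lib/apt/lists/*
This keeps the resulting image smaller at the cost of a slightly noisier Dockerfile.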

Working with the Docker image

Next, in order to run our code in the Docker image, we will mount our local code directory into a working directory inside the container. We can set that working directory in our Dockerfile, like so:
WORKDIR /workdir
You can replace /workdir with any directory you prefer. WORKDIR does a couple of things: it sets the directory in which subsequent Dockerfile instructions run, and it sets the directory a container starts in by default.

Now let us take another look at our complete Dockerfile.
FROM python:3.6-slim
RUN apt-get update \
    && apt-get install -y curl git
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
WORKDIR /workdir
After defining all the above dependencies in the Dockerfile, it’s time to build it. Run the following command:
docker build . -t my_project_image
This searches your current directory for a Dockerfile and tags the image with the name my_project_image. 
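
As a quick sanity check (assuming the build above succeeded), you can list the image and confirm that a container starts in the working directory we defined:
docker images my_project_image
docker run --rm my_project_image pwd
The second command should print /workdir.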

Next, run the Docker image as a container:
docker run -it \
-v $(pwd):/workdir \
my_project_image bash
A line-by-line explanation of the above command:
  1. docker run with the -i and -t flags combined (as -it) lets us run the Docker container as an interactive process
  2. The -v flag mounts a local directory onto a directory in the container. Here we specify our current directory with $(pwd), followed by a colon (:) and the container directory /workdir
  3. On the last line, we specify the image to run as a container and the entry process, bash
Our Docker container is now up and running with everything we need for developing our project (e.g. OS dependencies, Python dependencies and CLI tools). The simplified setup process no longer requires OS-specific scripts or Python version/virtual environment management tools. The only requirements are to have Docker installed on the target machine and to run the docker build and docker run commands.
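
From here, day-to-day use is just a matter of running your code inside the container. For example, to execute a script without opening an interactive shell (my_script.py is a placeholder for your own entry point):
docker run --rm -v $(pwd):/workdir my_project_image python my_script.py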

I hope that this article has shown how Docker can be used to effectively manage various dependencies in a project. While the example here is targeted towards setting up an environment in a Python project, using Docker to ensure reproducible environments is just as applicable to any other project.
