python - How to avoid cache invalidation in Dockerfile for large binary file (Python_Onbuild)
I am downloading a 1.6 GB compressed binary file in my Dockerfile, and unpacking it with gunzip produces a 3.6 GB file. I do not want this step repeated on every build because it takes a lot of time. It is a static file, so it should not be downloaded every time I deploy changes to the server using Jenkins/Docker. However, it is downloaded every time I commit changes and run Jenkins to deploy them.
Here is my Dockerfile:
    FROM python:2.7.13-onbuild
    RUN mkdir -p /usr/src/app
    WORKDIR /usr/src/app
    ARG DEBIAN_FRONTEND=noninteractive
    RUN apt-get update && apt-get install --assume-yes apt-utils
    RUN apt-get update && apt-get install -y curl
    RUN apt-get update && apt-get install -y unzip
    RUN curl -o - https://s3.amazonaws.com/dl4j-distribution/googlenews-vectors-negative300.bin.gz \
        | gunzip > /usr/src/app/googlenews-vectors-negative300.bin
Update: I changed the Dockerfile to the simple one given below:
    FROM python:2.7.13-onbuild
    RUN mkdir -p /usr/src/app
    WORKDIR /usr/src/app
    RUN echo "test cache"
    CMD /usr/local/bin/gunicorn -t 240 -k gevent -w 1 -b 0.0.0.0:8000 --reload src.wsgi:app
Now, if I do not change the code or any other file, it works fine and the command echo "test cache" is not repeated. However, as soon as I make a change to any file in the source folder, every command after the step that copies the source code into the image directory is repeated. This should not happen at that stage, because it means all of those commands are repeated whenever I make a commit.
Here is the output when I make no changes to the code and run the build a second time:
    Sending build context to Docker daemon 239.1kB
    Step 1/6 : FROM python:2.7.13-onbuild
    # Executing 3 build triggers...
    Step 1/1 : COPY requirements.txt /usr/src/app/
     ---> Using cache
    Step 1/1 : RUN pip install --no-cache-dir -r requirements.txt
     ---> Using cache
    Step 1/1 : COPY . /usr/src/app
     ---> Using cache
     ---> 1911c6dc9fce
    Step 2/6 : RUN mkdir -p /usr/src/app
     ---> Using cache
     ---> 4019b029d05c
    Step 3/6 : WORKDIR /usr/src/app
     ---> Using cache
     ---> 1a99833e908c
    Step 4/6 : RUN echo "test cache"
     ---> Using cache
     ---> 488a62aa1b09
Here is the output when I make a single change to one of the source files. You can see that echo "test cache" is repeated:
    Sending build context to Docker daemon 239.1kB
    Step 1/6 : FROM python:2.7.13-onbuild
    # Executing 3 build triggers...
    Step 1/1 : COPY requirements.txt /usr/src/app/
     ---> Using cache
    Step 1/1 : RUN pip install --no-cache-dir -r requirements.txt
     ---> Using cache
    Step 1/1 : COPY . /usr/src/app
     ---> 6fd1003e246a
    Removing intermediate container f25a4d2910cf
    Step 2/6 : RUN mkdir -p /usr/src/app
     ---> Running in ff324f381875
     ---> 3694086a2b6a
    Removing intermediate container ff324f381875
    Step 3/6 : WORKDIR /usr/src/app
     ---> 5f23ab9a15df
    Removing intermediate container 0b0d796f97d0
    Step 4/6 : RUN echo "test cache"
     ---> Running in 296d2f141015
    test cache
     ---> f90c7708d9eb
All the commands are repeated because python:2.7.13-onbuild is used as the base image. Its Dockerfile looks like this:
    FROM python:2.7
    RUN mkdir -p /usr/src/app
    WORKDIR /usr/src/app
    ONBUILD COPY requirements.txt /usr/src/app/
    ONBUILD RUN pip install --no-cache-dir -r requirements.txt
    ONBUILD COPY . /usr/src/app
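With this base image, the ONBUILD triggers run right after the FROM line, so COPY . /usr/src/app is executed before any of the commands in my own Dockerfile. That COPY produces a different layer every time the source code changes, and once one layer changes, the cache is invalidated for every step that follows it.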
I was advised to use python:2.7 as the base image directly, to have more control over the COPY operation. The new Dockerfile below, with the COPY commands at the end, solved the issue:
    FROM python:2.7
    RUN mkdir -p /usr/src/app
    WORKDIR /usr/src/app
    ARG DEBIAN_FRONTEND=noninteractive
    RUN apt-get update && apt-get install --assume-yes apt-utils
    RUN apt-get update && apt-get install -y curl
    RUN apt-get update && apt-get install -y unzip
    RUN curl -o - https://s3.amazonaws.com/dl4j-distribution/googlenews-vectors-negative300.bin.gz \
        | gunzip > /usr/src/app/googlenews-vectors-negative300.bin
    COPY requirements.txt /usr/src/app/
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . /usr/src/app
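To see why moving the COPY commands to the end helps, here is a rough toy model of the layer cache in Python. It is only a sketch under simplifying assumptions: each step's cache key is a hash of its parent's key plus the instruction text, and COPY steps also hash the copied file contents. Docker's real cache bookkeeping is more involved. The file app.py, the file contents, and the shortened "RUN curl | gunzip" label are hypothetical stand-ins, not taken from the project.

```python
import hashlib


def layer_keys(steps, files):
    """Return one cache key per step (toy model, not Docker's real algorithm).

    steps: list of (instruction_text, list_of_copied_file_names)
    files: dict mapping file name -> file content
    """
    keys = []
    parent = ""
    for instruction, copied in steps:
        h = hashlib.sha256()
        h.update(parent.encode())       # a key depends on the previous layer
        h.update(instruction.encode())  # ...and on the instruction text
        for name in copied:             # ...and, for COPY, on file contents
            h.update(files[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def rebuilt(steps, before, after):
    """List the instructions whose cache key differs between two builds."""
    old = layer_keys(steps, before)
    new = layer_keys(steps, after)
    return [step[0] for step, a, b in zip(steps, old, new) if a != b]


# Hypothetical project files: only app.py changes between the two builds.
before = {"requirements.txt": "gunicorn\n", "app.py": "print('v1')\n"}
after = {"requirements.txt": "gunicorn\n", "app.py": "print('v2')\n"}

ALL_FILES = ["requirements.txt", "app.py"]

# Order produced by the onbuild base image: the COPY triggers run first.
onbuild_order = [
    ("COPY requirements.txt /usr/src/app/", ["requirements.txt"]),
    ("RUN pip install --no-cache-dir -r requirements.txt", []),
    ("COPY . /usr/src/app", ALL_FILES),
    ("RUN mkdir -p /usr/src/app", []),
    ("RUN curl | gunzip (large download)", []),
]

# Order in the fixed Dockerfile: the COPY steps come last.
fixed_order = [
    ("RUN mkdir -p /usr/src/app", []),
    ("RUN curl | gunzip (large download)", []),
    ("COPY requirements.txt /usr/src/app/", ["requirements.txt"]),
    ("RUN pip install --no-cache-dir -r requirements.txt", []),
    ("COPY . /usr/src/app", ALL_FILES),
]

print(rebuilt(onbuild_order, before, after))
print(rebuilt(fixed_order, before, after))
```

In this model, the onbuild ordering rebuilds the COPY . step and everything after it, including the large download, while the fixed ordering rebuilds only the final COPY . step.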
According to the documentation, use of the onbuild variant of the python image is discouraged.
This explanation was inspired by an answer to another question on the same issue: Which Python variant to use as a base image in Dockerfiles?