<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<!-- Google Analytics -->
<script type="text/javascript">
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-128479674-2', 'auto');
ga('send', 'pageview');
</script>
<!-- End Google Analytics -->
<title>Kai | 凯</title>
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
<meta property="og:type" content="website">
<meta property="og:title" content="Kai | 凯">
<meta property="og:url" content="https://kaichu.se/index.html">
<meta property="og:site_name" content="Kai | 凯">
<meta property="og:locale" content="en_US">
<meta property="article:author" content="Kai Chu">
<meta name="twitter:card" content="summary">
<link rel="alternate" href="/atom.xml" title="Kai | 凯" type="application/atom+xml">
<link rel="icon" href="/favicon.ico">
<link href="//fonts.googleapis.com/css?family=Source+Code+Pro" rel="stylesheet" type="text/css">
<link rel="stylesheet" href="/css/style.css">
<meta name="generator" content="Hexo 5.0.0"></head>
<body>
<div id="container">
<div id="wrap">
<header id="header">
<div id="banner"></div>
<div id="header-outer" class="outer">
<div id="header-title" class="inner">
<h1 id="logo-wrap">
<a href="/" id="logo">Kai | 凯</a>
</h1>
</div>
<div id="header-inner" class="inner">
<nav id="main-nav">
<a id="main-nav-toggle" class="nav-icon"></a>
<a class="main-nav-link" href="/">Home</a>
<a class="main-nav-link" href="/archives">Archives</a>
</nav>
<nav id="sub-nav">
<a id="nav-rss-link" class="nav-icon" href="/atom.xml" title="RSS Feed"></a>
<a id="nav-search-btn" class="nav-icon" title="Search"></a>
</nav>
<div id="search-form-wrap">
<form action="//google.com/search" method="get" accept-charset="UTF-8" class="search-form"><input type="search" name="q" class="search-form-input" placeholder="Search"><button type="submit" class="search-form-submit"></button><input type="hidden" name="sitesearch" value="https://kaichu.se"></form>
</div>
</div>
</div>
</header>
<div class="outer">
<section id="main">
<article id="post-How-to-save-variables-in-session-or-query-results-into-files-in-gatling" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/Gatling/2021/10/05/How-to-save-variables-in-session-or-query-results-into-files-in-gatling.html" class="article-date">
<time datetime="2021-10-05T00:00:00.000Z" itemprop="datePublished">2021-10-05</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/Gatling/">Gatling</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/Gatling/2021/10/05/How-to-save-variables-in-session-or-query-results-into-files-in-gatling.html">How to save variables in gatling session or query results into files</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p>This post is for users of the Gatling Gradle plugin <code>io.gatling.gradle</code> who need to solve the runtime resource path problem when running <code>./bin/gatling.sh</code> after building the bundle.</p>
<h1 id="Background"><a href="#Background" class="headerlink" title="Background"></a>Background</h1><p>I initially set up a project with the Gradle plugin. After developing the load tests, I realized that running them with Gradle on a remote host is not much fun, so I decided to pack everything into a <a target="_blank" rel="noopener" href="https://gatling.io/docs/gatling/reference/current/general/bundle_structure/">bundle structure</a> that lets me run the tests from the command line anywhere. This post will not go into how I build the bundle. That worked for a simple Simulation. However, as soon as I started to use resources, in particular a <a target="_blank" rel="noopener" href="https://gatling.io/docs/gatling/reference/current/session/feeder/">Feeder</a> to inject users’ data, the question of which resources folder the test uses, in the Gradle project and in the bundle, kept bothering me.</p>
<p>The gradle project has following files:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">project</span><br><span class="line">+ build</span><br><span class="line">++ classes</span><br><span class="line">++ resources</span><br><span class="line">+++ myFile.csv</span><br><span class="line">+ src</span><br><span class="line">++ gatling</span><br><span class="line">+++ resources</span><br><span class="line">++++ myFile.csv</span><br><span class="line">+++ scala</span><br><span class="line">++++ MySimulation</span><br></pre></td></tr></table></figure>
<p>The structure of the gatling bundle:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">bundle</span><br><span class="line">+ bin</span><br><span class="line">++ gatling.sh</span><br><span class="line">+ conf</span><br><span class="line">++ gatling.conf</span><br><span class="line">+ lib</span><br><span class="line">+ user-files</span><br><span class="line">++ resources</span><br><span class="line">+++ myFile.csv</span><br><span class="line">++ simulations</span><br><span class="line">+++ MySimulation</span><br></pre></td></tr></table></figure>
<h1 id="What-is-the-problem-and-Where-I-meet-it"><a href="#What-is-the-problem-and-Where-I-meet-it" class="headerlink" title="What is the problem and Where I meet it"></a>What is the problem and Where I meet it</h1><p>In Gatling, it’s quite common to use the <a target="_blank" rel="noopener" href="https://gatling.io/docs/gatling/reference/current/session/session_api/">Session</a> to save or fetch data for a virtual user.</p>
<p>In a recent project, I realized that we have a few Simulations that may not run at the same time. In fact, one Simulation may use the results from another Simulation, or we may need to log the results of one Simulation.</p>
<p>So I decided to use a PrintWriter to write some session variables into a file inside the Gradle project. I also want to keep the file in the git repo so that I can rerun the Simulation that needs the results.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">// needs: import java.io.{File, FileOutputStream, PrintWriter}</span><br><span class="line">.exec(session => {</span><br><span class="line">  // open in append mode so that each virtual user adds one line</span><br><span class="line">  val writer = new PrintWriter(new FileOutputStream(new File("src/gatling/resources/myFile.csv"), true))</span><br><span class="line">  writer.write(session("username").as[String] + "," + session("sessionKey").as[String])</span><br><span class="line">  writer.write("\n")</span><br><span class="line">  writer.close()</span><br><span class="line">  session</span><br><span class="line">})</span><br></pre></td></tr></table></figure>
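<p>As a hedged aside, the write above could be wrapped in a small helper so that the writer is always closed and concurrent virtual users do not interleave their lines. This is only a sketch under assumptions: <code>SessionFileLog</code> and <code>appendLine</code> are hypothetical names that are not part of Gatling, and <code>scala.util.Using</code> assumes Scala 2.13 or later.</p>
<figure class="highlight plain"><table><tr><td class="code"><pre><span class="line">import java.io.{File, FileOutputStream, PrintWriter}</span><br><span class="line">import scala.util.Using</span><br><span class="line"></span><br><span class="line">// Hypothetical helper, not part of Gatling.</span><br><span class="line">object SessionFileLog {</span><br><span class="line">  // Appends one comma-separated line to the file.</span><br><span class="line">  // synchronized keeps concurrent virtual users from interleaving their writes.</span><br><span class="line">  def appendLine(file: File, fields: String*): Unit = synchronized {</span><br><span class="line">    // Using.resource closes the writer even if the write throws.</span><br><span class="line">    Using.resource(new PrintWriter(new FileOutputStream(file, true))) { writer =></span><br><span class="line">      writer.println(fields.mkString(","))</span><br><span class="line">    }</span><br><span class="line">  }</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>Inside <code>.exec</code>, the call would then be <code>SessionFileLog.appendLine(new File("src/gatling/resources/myFile.csv"), session("username").as[String], session("sessionKey").as[String])</code>, followed by returning <code>session</code> as before.</p>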
<h2 id="The-runtime-resource-path-problem"><a href="#The-runtime-resource-path-problem" class="headerlink" title="The runtime resource path problem"></a>The runtime resource path problem</h2><p>Running Gatling Simulations with the Gradle task in an IDE works great. However, once everything is packed into a bundle, the code above no longer works, because there is no src/gatling folder anymore. If we check gatling.sh, we can see that there is a resources folder on the Gatling classpath:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">GATLING_CLASSPATH="$GATLING_HOME/lib/*:$GATLING_HOME/user-files/resources:$GATLING_HOME/user-files:$GATLING_CONF:"</span><br></pre></td></tr></table></figure>
<p>This means we can use the classloader to find the resource: <code>getClass.getResource("/myFile.csv").getPath</code> should give the path to the file.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">.exec(session => {</span><br><span class="line">  // resolve the file through the classpath instead of a source-tree path</span><br><span class="line">  val writer = new PrintWriter(new FileOutputStream(new File(getClass.getResource("/myFile.csv").getPath), true))</span><br><span class="line">  writer.write(session("username").as[String] + "," + session("sessionKey").as[String])</span><br><span class="line">  writer.write("\n")</span><br><span class="line">  writer.close()</span><br><span class="line">  session</span><br><span class="line">})</span><br></pre></td></tr></table></figure>
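<p>One caveat with the lookup above: <code>getResource</code> returns <code>null</code> when the file is not on the classpath, and <code>URL.getPath</code> keeps URL escapes such as <code>%20</code> for a space. A minimal sketch of a more defensive lookup could look like the following. <code>ResourceFiles</code> and <code>resourceFile</code> are hypothetical names, and the sketch assumes the resource is a plain file on disk (as in <code>user-files/resources</code>), not an entry inside a jar.</p>
<figure class="highlight plain"><table><tr><td class="code"><pre><span class="line">import java.io.File</span><br><span class="line">import java.nio.file.Paths</span><br><span class="line"></span><br><span class="line">// Hypothetical helper, not part of Gatling.</span><br><span class="line">object ResourceFiles {</span><br><span class="line">  // Resolves a classpath resource such as "myFile.csv" to a File on disk.</span><br><span class="line">  def resourceFile(name: String): File = {</span><br><span class="line">    val url = Option(getClass.getResource(s"/$name")).getOrElse(</span><br><span class="line">      throw new IllegalArgumentException(s"Resource /$name is not on the classpath"))</span><br><span class="line">    // Going through toURI decodes escapes such as %20 that URL.getPath keeps.</span><br><span class="line">    Paths.get(url.toURI).toFile</span><br><span class="line">  }</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>With such a helper, the writer could be opened on <code>ResourceFiles.resourceFile("myFile.csv")</code>, and a missing file fails with a readable message rather than a <code>NullPointerException</code>.</p>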
<h2 id="The-gradle-src-resource-path-problem"><a href="#The-gradle-src-resource-path-problem" class="headerlink" title="The gradle src resource path problem"></a>The gradle src resource path problem</h2><p>With this change, we can still run the Simulations with Gradle. However, Gradle copies src/resources into build/resources, so the classloader resolves the file path inside build/resources. Whenever I run the simulations, the files are generated in build/resources, and they are deleted when I clean the project, intentionally or accidentally.<br><code>Oh no, I need this file and I copied it to my src/resources each time I run it</code>, it’s not fun but it’s fine…</p>
<h1 id="Easy-solution"><a href="#Easy-solution" class="headerlink" title="Easy solution"></a>Easy solution</h1><p>Set an environment variable <code>Bundle</code> in gatling.sh and use the following wrapper to get the file path. In the bundle, the file is resolved through the classpath. In the Gradle project, it is written into the source tree so that it survives a clean.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">// env Bundle=true is set in gatling.sh when running from the bundle</span><br><span class="line"></span><br><span class="line">def testGeneratedFilePath(filename: String): String = {</span><br><span class="line">  val isBundle = System.getenv("Bundle") != null</span><br><span class="line">  if (isBundle)</span><br><span class="line">    // bundle: user-files/resources is on the classpath</span><br><span class="line">    getClass.getResource(s"/${filename}").getPath</span><br><span class="line">  else</span><br><span class="line">    // Gradle project: there is a src/gatling folder to write into</span><br><span class="line">    s"src/gatling/resources/${filename}"</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">val testFileAbsolutePath = new File(testGeneratedFilePath("myFile.csv"))</span><br></pre></td></tr></table></figure>
</div>
<footer class="article-footer">
<a data-url="https://kaichu.se/Gatling/2021/10/05/How-to-save-variables-in-session-or-query-results-into-files-in-gatling.html" data-id="ckuelnhqs0060exoj25ve9kzb" class="article-share-link">Share</a>
<a href="https://kaichu.se/Gatling/2021/10/05/How-to-save-variables-in-session-or-query-results-into-files-in-gatling.html#disqus_thread" class="article-comment-link">Comments</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Gatling/" rel="tag">Gatling</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Loadtest/" rel="tag">Loadtest</a></li></ul>
</footer>
</div>
</article>
<article id="post-Setup-two-stages-of-Dockerfile-to-build-a-nodejs-app" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/Docker/2021/04/18/Setup-two-stages-of-Dockerfile-to-build-a-nodejs-app.html" class="article-date">
<time datetime="2021-04-18T21:08:13.000Z" itemprop="datePublished">2021-04-18</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/Docker/">Docker</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/Docker/2021/04/18/Setup-two-stages-of-Dockerfile-to-build-a-nodejs-app.html">Setup two stages of Dockerfile to build a nodejs app</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p>In this post, I’ll create a simple http server app with nodejs and set up build scripts to bundle the project into a dist folder.<br>After validating that it works, I’ll create a Dockerfile that builds the nodejs app in a container and then creates a final image with only the dist folder as a new image layer.</p>
<h1 id="An-simple-Nodejs-App"><a href="#An-simple-Nodejs-App" class="headerlink" title="An simple Nodejs App"></a>A simple Nodejs App</h1><ul>
<li>Create a folder <code>mkdir -p ~/two-stages-docker-build-nodejs-app</code> and go to that folder <code>cd ~/two-stages-docker-build-nodejs-app</code> to initialize an npm project <code>npm init --yes</code>. </li>
<li>Add license <code>npx license Apache-2</code></li>
<li>Create a src folder <code>mkdir -p ~/two-stages-docker-build-nodejs-app/src</code> and add an index.js file <code>cd ~/two-stages-docker-build-nodejs-app/src &amp;&amp; touch index.js</code> with a minimal http server.<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">var http = require('http');</span><br><span class="line"></span><br><span class="line">//create a server object:</span><br><span class="line">http.createServer(function (req, res) {</span><br><span class="line">  res.write('Hello World!'); //write a response to the client</span><br><span class="line">  res.end(); //end the response</span><br><span class="line">}).listen(8080); //the server object listens on port 8080</span><br></pre></td></tr></table></figure></li>
<li>We want to have gulp as a bundle tool and add babel, uglify and rename pipe to it<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">npm install @babel/core gulp gulp-babel gulp-uglify gulp-rename --save-dev</span><br></pre></td></tr></table></figure></li>
<li>Create a <code>gulpfile.js</code> and add a simple build task in it, we take everything from src folder and put the result into dist folder<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">const { src, dest, parallel } = require('gulp');</span><br><span class="line">const babel = require('gulp-babel');</span><br><span class="line">const uglify = require('gulp-uglify');</span><br><span class="line">const rename = require('gulp-rename');</span><br><span class="line"></span><br><span class="line">function srcBuild() {</span><br><span class="line"> return src("src/*.js")</span><br><span class="line"> .pipe(babel())</span><br><span class="line"> .pipe(dest('dist/'))</span><br><span class="line"> .pipe(uglify())</span><br><span class="line"> .pipe(rename({ extname: '.min.js' }))</span><br><span class="line"> .pipe(dest('dist/'));</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">exports.default = srcBuild</span><br></pre></td></tr></table></figure></li>
<li>Run the build to test it works <code>npx gulp</code></li>
<li>Run the server to test it works <code>node dist/index.min.js</code></li>
</ul>
<h1 id="To-build-the-nodejs-app-into-a-docker-container"><a href="#To-build-the-nodejs-app-into-a-docker-container" class="headerlink" title="To build the nodejs app into a docker container"></a>To build the nodejs app into a docker container</h1><ul>
<li><p>Create a Dockerfile under the root of the project <code>touch Dockerfile</code> </p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">FROM node:12 AS builder</span><br><span class="line">WORKDIR /build</span><br><span class="line">COPY . .</span><br><span class="line">RUN npm install && npx gulp</span><br><span class="line"></span><br><span class="line">FROM node:12</span><br><span class="line">WORKDIR /app</span><br><span class="line">COPY --from=builder /build/dist .</span><br><span class="line">ENTRYPOINT [ "node", "./index.js" ]</span><br></pre></td></tr></table></figure>
</li>
<li><p>Build the image <code>docker build . -t two-stages-docker-nodejs</code></p>
</li>
<li><p>Run the image <code>docker run -p 8080:8080 two-stages-docker-nodejs</code></p>
</li>
<li><p>Access in the browser <a target="_blank" rel="noopener" href="http://localhost:8080/">http://localhost:8080</a></p>
</li>
</ul>
<h1 id="Related-tools"><a href="#Related-tools" class="headerlink" title="Related tools:"></a>Related tools:</h1><p>NPM<br>Gulp<br>Docker</p>
<h1 id="References"><a href="#References" class="headerlink" title="References:"></a>References:</h1><p><a target="_blank" rel="noopener" href="https://gulpjs.com/docs/en/getting-started/quick-start">https://gulpjs.com/docs/en/getting-started/quick-start</a><br><a target="_blank" rel="noopener" href="https://docs.npmjs.com/getting-started">https://docs.npmjs.com/getting-started</a><br><a target="_blank" rel="noopener" href="https://docs.docker.com/develop/develop-images/multistage-build/">https://docs.docker.com/develop/develop-images/multistage-build/</a></p>
<h1 id="Source-code"><a href="#Source-code" class="headerlink" title="Source code"></a>Source code</h1><p><a target="_blank" rel="noopener" href="https://github.com/kai-chu/PieceOfCodes/tree/master/two-stages-docker-build-nodejs-app">https://github.com/kai-chu/PieceOfCodes/tree/master/two-stages-docker-build-nodejs-app</a></p>
</div>
<footer class="article-footer">
<a data-url="https://kaichu.se/Docker/2021/04/18/Setup-two-stages-of-Dockerfile-to-build-a-nodejs-app.html" data-id="ckuelnhpq001wexoj84e6ecfr" class="article-share-link">Share</a>
<a href="https://kaichu.se/Docker/2021/04/18/Setup-two-stages-of-Dockerfile-to-build-a-nodejs-app.html#disqus_thread" class="article-comment-link">Comments</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Docker/" rel="tag">Docker</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Nodejs/" rel="tag">Nodejs</a></li></ul>
</footer>
</div>
</article>
<article id="post-Ansible-built-in-module-with-items" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/Ansible/2020/09/23/Ansible-built-in-module-with-items.html" class="article-date">
<time datetime="2020-09-23T23:23:42.000Z" itemprop="datePublished">2020-09-23</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/Ansible/">Ansible</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/Ansible/2020/09/23/Ansible-built-in-module-with-items.html">Ansible built-in module - with_items</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p><img src="/assets/ansible.png" alt="ansible"></p>
<p>This is a demo of the Ansible built-in lookup with_items.</p>
<p>Before the topic, here is a reminder of the <a target="_blank" rel="noopener" href="https://docs.ansible.com/ansible/latest/reference_appendices/YAMLSyntax.html#yaml-basics">yaml basic syntax</a> for lists and dictionaries.<br>A list in yaml can be written in two equivalent forms (yaml or an abbreviated form), and both can be used in a yaml file:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">listName:</span><br><span class="line">- 1</span><br><span class="line">- 2</span><br><span class="line">- 3</span><br><span class="line"># which is the same as the abbreviated form</span><br><span class="line">listName: [1,2,3]</span><br></pre></td></tr></table></figure>
<p>And it’s the same for dictionary</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">- firstName: kai</span><br><span class="line">  lastName: chu</span><br><span class="line">  age: 29</span><br><span class="line">  phone: 888888</span><br><span class="line"># which is the same as the abbreviated form</span><br><span class="line">{firstName: kai, lastName: chu, age: 29, phone: 888888}</span><br></pre></td></tr></table></figure>
<p>Keeping that in mind, it’s easier to understand different usages in different projects, regardless of mixed syntax playbooks written by the DevOps.</p>
<p>The following explanation is similar to the <a target="_blank" rel="noopener" href="https://docs.ansible.com/ansible/latest/collections/ansible/builtin/items_lookup.html">official examples</a>.<br>If you have clearly understood the official examples, you don’t have to go further with this post.</p>
<p>This post gives examples about <em>list of values</em> and <em>list of dictionaries</em></p>
<p>In an ansible playbook, we can use with_items with a list of values, a list of dictionaries or a variable, written either in yaml syntax or in the abbreviated form.</p>
<h2 id="4-forms-of-using-list-of-values"><a href="#4-forms-of-using-list-of-values" class="headerlink" title="4 forms of using list of values"></a>4 forms of using list of values</h2><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line">---</span><br><span class="line">- name: >-</span><br><span class="line">    Demo ansible build-in withItems with list,</span><br><span class="line">    this lookup returns a list of items given to it,</span><br><span class="line">    if any of the top level items is also a list it will flatten it,</span><br><span class="line">    but it will not recurse</span><br><span class="line">  hosts: localhost</span><br><span class="line">  connection: local</span><br><span class="line">  vars:</span><br><span class="line">    list_in_var:</span><br><span class="line">      - green</span><br><span class="line">      - red</span><br><span class="line">      - blue</span><br><span class="line"></span><br><span class="line">    list_in_var_as_abbreviated_form: [green, red, blue]</span><br><span class="line"></span><br><span class="line">  tasks:</span><br><span class="line">    - name: "[List of items - 01] items defined in the same playbook"</span><br><span class="line">      debug:</span><br><span class="line">        msg: "An item: {{ item }}"</span><br><span class="line">      with_items:</span><br><span class="line">        - green</span><br><span class="line">        - red</span><br><span class="line">        - blue</span><br><span class="line"></span><br><span class="line">    - name: "[List of items - 02] items defined in a variable"</span><br><span class="line">      debug:</span><br><span class="line">        msg: "An item: {{ item }}"</span><br><span class="line">      with_items: "{{ list_in_var }}"</span><br><span class="line"></span><br><span class="line">    - name: "[List of items - 03] items in an abbreviated form defined in the same playbook"</span><br><span class="line">      debug:</span><br><span class="line">        msg: "An item: {{ item }}"</span><br><span class="line">      with_items: [green, red, blue]</span><br><span class="line"></span><br><span class="line">    - name: "[List of items - 04] items in an abbreviated form variable"</span><br><span class="line">      debug:</span><br><span class="line">        msg: "An item: {{ item }}"</span><br><span class="line">      with_items: "{{list_in_var_as_abbreviated_form}}"</span><br></pre></td></tr></table></figure>
<p>The output: </p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">$ ansible-playbook playbook.yaml </span><br><span class="line"></span><br><span class="line">PLAY [Demo ansible build-in withItems, this lookup returns a list of items given to it, if any of the top level items is also a list it will flatten it, but it will not recurse] ***</span><br><span class="line"></span><br><span class="line">TASK [Gathering Facts] *********************************************************</span><br><span class="line">ok: [localhost]</span><br><span class="line"></span><br><span class="line">TASK [[List of items - 01] items defined in the same playbook] *****************</span><br><span class="line">ok: [localhost] => (item=green) => {</span><br><span class="line"> "msg": "An item: green"</span><br><span class="line">}</span><br><span class="line">ok: [localhost] => (item=red) => {</span><br><span class="line"> "msg": "An item: red"</span><br><span class="line">}</span><br><span class="line">ok: [localhost] => (item=blue) => {</span><br><span class="line"> "msg": "An item: blue"</span><br><span class="line">}</span><br><span class="line">...</span><br></pre></td></tr></table></figure>
<h2 id="4-forms-of-using-list-of-dictionaries"><a href="#4-forms-of-using-list-of-dictionaries" class="headerlink" title="4 forms of using list of dictionaries"></a>4 forms of using list of dictionaries</h2><p>A list of dictionaries works the same way as a list of plain values. Each <em>item</em> is a dictionary, so we can read a value with <code>item.key</code>, for example <code>item.name</code>.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line">---</span><br><span class="line">- name: >- </span><br><span class="line"> Demo ansible build-in with_items with list of dictionaries, </span><br><span class="line"> this lookup returns a list of items given to it, </span><br><span class="line"> if any of 
the top level items is also a list it will flatten it, </span><br><span class="line"> but it will not recurse</span><br><span class="line"> hosts: localhost</span><br><span class="line"> connection: local</span><br><span class="line"> vars:</span><br><span class="line"> list_of_dictionaries_in_var:</span><br><span class="line"> - name: Green</span><br><span class="line"> color: green</span><br><span class="line"> - name: Red</span><br><span class="line"> color: red</span><br><span class="line"> - name: Blue</span><br><span class="line"> color: blue</span><br><span class="line"></span><br><span class="line"> list_of_dictionaries_in_var_as_abbreviated_form:</span><br><span class="line"> - {name: Green, color: green}</span><br><span class="line"> - {name: Red, color: red}</span><br><span class="line"> - {name: Blue, color: blue}</span><br><span class="line"></span><br><span class="line"> tasks:</span><br><span class="line"> - name: "[list of dict items - 01] items defined in the same playbook"</span><br><span class="line"> debug:</span><br><span class="line"> msg: "An item name: {{ item.name }}, color: {{ item.color }}"</span><br><span class="line"> with_items:</span><br><span class="line"> - name: Green</span><br><span class="line"> color: green</span><br><span class="line"> - name: Red</span><br><span class="line"> color: red</span><br><span class="line"> - name: Blue</span><br><span class="line"> color: blue</span><br><span class="line"></span><br><span class="line"> - name: "[list of dict items - 01] items defined in the same playbook"</span><br><span class="line"> debug:</span><br><span class="line"> msg: "An item name: {{ item.name }}, color: {{ item.color }}"</span><br><span class="line"> with_items: </span><br><span class="line"> - { name: Green, color: green }</span><br><span class="line"> - { name: Red, color: red }</span><br><span class="line"> - { name: Blue, color: blue }</span><br><span class="line"></span><br><span class="line"> - name: "[List of dict 
items - 02] items defined in an variable"</span><br><span class="line"> debug:</span><br><span class="line"> msg: "An item name: {{ item.name }}, color: {{ item.color }}"</span><br><span class="line"> with_items: "{{ list_of_dictionaries_in_var }}"</span><br><span class="line"></span><br><span class="line"> - name: "[List of dict items - 03] items defined in an abbreviated form variable"</span><br><span class="line"> debug:</span><br><span class="line"> msg: "An item name: {{ item.name }}, color: {{ item.color }}"</span><br><span class="line"> with_items: "{{list_of_dictionaries_in_var_as_abbreviated_form}}"</span><br></pre></td></tr></table></figure>
<p>The output: </p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">$ ansible-playbook playbook-with-items-dict.yaml </span><br><span class="line"></span><br><span class="line">PLAY [Demo ansible build-in with_items with list of dictionaries, this lookup returns a list of items given to it, if any of the top level items is also a list it will flatten it, but it will not recurse] ***</span><br><span class="line"></span><br><span class="line">TASK [Gathering Facts] *********************************************************</span><br><span class="line">ok: [localhost]</span><br><span class="line"></span><br><span class="line">TASK [[list of dict items - 01] items defined in the same playbook] ************</span><br><span class="line">ok: [localhost] => (item={u'color': u'green', u'name': u'Green'}) => {</span><br><span class="line"> "msg": "An item name: Green, color: green"</span><br><span class="line">}</span><br><span class="line">ok: [localhost] => (item={u'color': u'red', u'name': u'Red'}) => {</span><br><span class="line"> "msg": "An item name: Red, color: red"</span><br><span class="line">}</span><br><span class="line">ok: [localhost] => (item={u'color': u'blue', u'name': u'Blue'}) => {</span><br><span class="line"> "msg": "An item name: Blue, color: blue"</span><br><span class="line">}</span><br><span 
class="line">...</span><br></pre></td></tr></table></figure>
<h2 id="Summary"><a href="#Summary" class="headerlink" title="Summary"></a>Summary</h2><p>I find <code>with_items</code> really useful when a provisioning task needs to apply several similar configurations. It is most flexible when the configurations are kept as key-value entries in a variable file: the playbook loops over them with <code>with_items</code>, so adding a new item only means editing the variable file, not the playbook.</p>
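<p>A minimal sketch of that setup, assuming a hypothetical variable file <code>vars/colors.yaml</code> and a hypothetical variable name <code>colors</code> (neither comes from the playbooks above). The variable file holds the list:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"># vars/colors.yaml (hypothetical file name)</span><br><span class="line">colors:</span><br><span class="line">  - { name: Green, color: green }</span><br><span class="line">  - { name: Red, color: red }</span><br><span class="line">  - { name: Blue, color: blue }</span><br></pre></td></tr></table></figure>
<p>The playbook loads the file with <code>vars_files</code> and loops over the list. Adding a fourth color should then only require a new entry in the variable file:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">---</span><br><span class="line">- name: Loop over items kept in a variable file</span><br><span class="line">  hosts: localhost</span><br><span class="line">  connection: local</span><br><span class="line">  vars_files:</span><br><span class="line">    - vars/colors.yaml   # hypothetical path, relative to the playbook</span><br><span class="line">  tasks:</span><br><span class="line">    - name: Print every configured color</span><br><span class="line">      debug:</span><br><span class="line">        msg: "An item name: {{ item.name }}, color: {{ item.color }}"</span><br><span class="line">      with_items: "{{ colors }}"</span><br></pre></td></tr></table></figure>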
</div>
<footer class="article-footer">
<a data-url="https://kaichu.se/Ansible/2020/09/23/Ansible-built-in-module-with-items.html" data-id="ckuelnhpp001texojbymq818p" class="article-share-link">Share</a>
<a href="https://kaichu.se/Ansible/2020/09/23/Ansible-built-in-module-with-items.html#disqus_thread" class="article-comment-link">Comments</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/ansible/" rel="tag">ansible</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/devops/" rel="tag">devops</a></li></ul>
</footer>
</div>
</article>
<article id="post-Kubernetes-02-the-3-practical-ways-to-use-k8s-secret" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/Kubernetes/2020/09/21/Kubernetes-02-the-3-practical-ways-to-use-k8s-secret.html" class="article-date">
<time datetime="2020-09-21T22:07:33.000Z" itemprop="datePublished">2020-09-21</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/Kubernetes/">Kubernetes</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/Kubernetes/2020/09/21/Kubernetes-02-the-3-practical-ways-to-use-k8s-secret.html">Kubernetes - 02 - the 3 practical ways to use k8s secret</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p><img src="/assets/kubernetes.png" alt="kubernetes"></p>
<p>This is the second post about Kubernetes secrets. In the previous post, I listed 3 ways to create secrets, and we can create as many as we want. In this post, I will show 3 practical ways to use secrets in a k8s deployment.</p>
<p>Let’s assume we have created a secret named <code>mysecret</code> as follows:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Secret</span><br><span class="line">metadata:</span><br><span class="line">  name: mysecret</span><br><span class="line">type: Opaque</span><br><span class="line">stringData:</span><br><span class="line">  username: admin</span><br><span class="line">  password: 1f2d1e2e67df</span><br></pre></td></tr></table></figure>
<h1 id="3-ways-to-use-k8s-secrets"><a href="#3-ways-to-use-k8s-secrets" class="headerlink" title="3 ways to use k8s secrets"></a>3 ways to use k8s secrets</h1><ul>
<li>As environment variables</li>
<li>As volume files</li>
<li>As Kubelet auth credentials to pull image</li>
</ul>
<h1 id="1-Used-as-environments"><a href="#1-Used-as-environments" class="headerlink" title="1 Used as environments"></a>1 Used as environments</h1><p>A secret is a dictionary of key-value pairs, so we can expose it to a container as environment variables.<br>We can either load the whole dictionary into the environment, or reference a single key and use its value.</p>
<h2 id="1-1-Use-as-a-value-of-a-user-defined-env-var"><a href="#1-1-Use-as-a-value-of-a-user-defined-env-var" class="headerlink" title="1.1 Use as a value of a user defined env var"></a>1.1 Use as a value of a user defined env var</h2><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Pod</span><br><span class="line">metadata:</span><br><span class="line">  name: secret-env-pod</span><br><span class="line">spec:</span><br><span class="line">  containers:</span><br><span class="line">  - name: container-name</span><br><span class="line">    env:</span><br><span class="line">    - name: PGDATA</span><br><span class="line">      value: /var/lib/postgresql/data/pgdata</span><br><span class="line">    - name: POSTGRES_USER</span><br><span class="line">      valueFrom:</span><br><span class="line">        secretKeyRef:</span><br><span class="line">          name: mysecret</span><br><span class="line">          key: username</span><br></pre></td></tr></table></figure>
<p>The environment variable <code>POSTGRES_USER</code> will have the value <code>admin</code>.</p>
<h2 id="1-2-Use-keys-from-secret-directly-as-env-key-word-envFrom"><a href="#1-2-Use-keys-from-secret-directly-as-env-key-word-envFrom" class="headerlink" title="1.2 Use keys from secret directly as env, key word envFrom"></a>1.2 Use keys from secret directly as env, key word <code>envFrom</code></h2><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Pod</span><br><span class="line">metadata:</span><br><span class="line">  name: secret-env-pod</span><br><span class="line">spec:</span><br><span class="line">  containers:</span><br><span class="line">  - name: container-name</span><br><span class="line">    env:</span><br><span class="line">    - name: PGDATA</span><br><span class="line">      value: /var/lib/postgresql/data/pgdata</span><br><span class="line">    envFrom:</span><br><span class="line">    - secretRef:</span><br><span class="line">        name: mysecret</span><br></pre></td></tr></table></figure>
<p>The environment variables <code>username</code> and <code>password</code> will be available in the container.</p>
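<p>Bare key names such as <code>username</code> can be too generic as variable names. As a minimal sketch, <code>envFrom</code> also accepts an optional <code>prefix</code> that is prepended to every key. The prefix <code>DB_</code> below is only an assumption for illustration; with it the container should see <code>DB_username</code> and <code>DB_password</code>.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"># fragment of a container definition</span><br><span class="line">envFrom:</span><br><span class="line">- prefix: DB_          # hypothetical prefix, prepended to each secret key</span><br><span class="line">  secretRef:</span><br><span class="line">    name: mysecret</span><br></pre></td></tr></table></figure>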
<h1 id="2-Used-as-volume-files"><a href="#2-Used-as-volume-files" class="headerlink" title="2 Used as volume files"></a>2 Used as volume files</h1><p>We can use the secret keys to generate files in a volume and mount that volume into a container. Our secret has 2 keys, so two files (<code>username</code> and <code>password</code>) will be created at the root of the volume.</p>
<h2 id="2-1-Mount-all-keys"><a href="#2-1-Mount-all-keys" class="headerlink" title="2.1 Mount all keys"></a>2.1 Mount all keys</h2><p>Two steps to mount a secret into a container.</p>
<h3 id="2-2-1-Create-a-volume-from-a-secret"><a href="#2-2-1-Create-a-volume-from-a-secret" class="headerlink" title="2.1.1 Create a volume from a secret"></a>2.1.1 Create a volume from a secret</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">volumes:</span><br><span class="line">- name: volume-from-secret</span><br><span class="line">  secret:</span><br><span class="line">    secretName: mysecret</span><br></pre></td></tr></table></figure>
<h3 id="2-2-2-Mount-the-volume-to-a-directory"><a href="#2-2-2-Mount-the-volume-to-a-directory" class="headerlink" title="2.1.2 Mount the volume to a directory"></a>2.1.2 Mount the volume to a directory</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">volumeMounts:</span><br><span class="line">- name: volume-from-secret</span><br><span class="line">  mountPath: "/etc/foo"</span><br><span class="line">  readOnly: true</span><br></pre></td></tr></table></figure>
<p>Putting them together in a pod yaml file, this time with the volume named <code>foo</code>:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Pod</span><br><span class="line">metadata:</span><br><span class="line">  name: mypod</span><br><span class="line">spec:</span><br><span class="line">  containers:</span><br><span class="line">  - name: mypod</span><br><span class="line">    image: redis</span><br><span class="line">    volumeMounts:</span><br><span class="line">    - name: foo</span><br><span class="line">      mountPath: "/etc/foo"</span><br><span class="line">      readOnly: true</span><br><span class="line">  volumes:</span><br><span class="line">  - name: foo</span><br><span class="line">    secret:</span><br><span class="line">      secretName: mysecret</span><br></pre></td></tr></table></figure>
<p>Since we mount the <code>secret volume</code> (my name for a volume created from a secret) to <code>/etc/foo</code>, the container can read each value from the file named after its secret key.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ cat /etc/foo/username</span><br><span class="line">admin</span><br><span class="line">$ cat /etc/foo/password</span><br><span class="line">1f2d1e2e67df</span><br></pre></td></tr></table></figure>
<h2 id="2-2-Mount-subset-of-the-secret-keys-to-user-defind-subfolder"><a href="#2-2-Mount-subset-of-the-secret-keys-to-user-defind-subfolder" class="headerlink" title="2.2 Mount a subset of the secret keys to a user-defined subfolder"></a>2.2 Mount a subset of the secret keys to a user-defined subfolder</h2><p>To select specific keys instead of mounting all of them into the folder, we can add <code>items</code> when creating the <code>secret volume</code>.</p>
<h3 id="2-2-1-Create-a-volume-from-a-secret-1"><a href="#2-2-1-Create-a-volume-from-a-secret-1" class="headerlink" title="2.2.1 Create a volume from a secret"></a>2.2.1 Create a volume from a secret</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">volumes:</span><br><span class="line">- name: volume-from-secret</span><br><span class="line">  secret:</span><br><span class="line">    secretName: mysecret</span><br><span class="line">    items:</span><br><span class="line">    - key: username</span><br><span class="line">      path: my-group/my-username</span><br></pre></td></tr></table></figure>
<h3 id="2-2-2-Mount-the-volume-to-a-directory-which-is-the-same-as-above"><a href="#2-2-2-Mount-the-volume-to-a-directory-which-is-the-same-as-above" class="headerlink" title="2.2.2 Mount the volume to a directory (which is the same as above)"></a>2.2.2 Mount the volume to a directory (which is the same as above)</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Pod</span><br><span class="line">metadata:</span><br><span class="line">  name: mypod</span><br><span class="line">spec:</span><br><span class="line">  containers:</span><br><span class="line">  - name: mypod</span><br><span class="line">    image: redis</span><br><span class="line">    volumeMounts:</span><br><span class="line">    - name: foo</span><br><span class="line">      mountPath: "/etc/foo"</span><br><span class="line">      readOnly: true</span><br><span class="line">  volumes:</span><br><span class="line">  - name: foo</span><br><span class="line">    secret:</span><br><span class="line">      secretName: mysecret</span><br><span class="line">      items:</span><br><span class="line">      - key: username</span><br><span class="line">        path: my-group/my-username</span><br></pre></td></tr></table></figure>
<p>Now only the <code>username</code> key is projected, at the path given in <code>items</code>.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ cat /etc/foo/my-group/my-username</span><br><span class="line">admin</span><br></pre></td></tr></table></figure>
<h2 id="2-3-File-mode"><a href="#2-3-File-mode" class="headerlink" title="2.3 File mode"></a>2.3 File mode</h2><p>Files are created with mode <code>0644</code> by default, which can be changed as follows:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">volumes:</span><br><span class="line">- name: foo</span><br><span class="line">  secret:</span><br><span class="line">    secretName: mysecret</span><br><span class="line">    defaultMode: 0400</span><br></pre></td></tr></table></figure>
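<p>The mode can also be set for a single file. The fragment below is a sketch that combines <code>defaultMode</code> with a per-item <code>mode</code>, reusing the <code>items</code> entry from section 2.2; the value <code>0440</code> is only an illustration. YAML accepts the octal notation, while a JSON manifest would need the decimal equivalent (256 for 0400).</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">volumes:</span><br><span class="line">- name: foo</span><br><span class="line">  secret:</span><br><span class="line">    secretName: mysecret</span><br><span class="line">    defaultMode: 0400        # applies to every file without its own mode</span><br><span class="line">    items:</span><br><span class="line">    - key: username</span><br><span class="line">      path: my-group/my-username</span><br><span class="line">      mode: 0440             # illustrative value, overrides defaultMode for this file</span><br></pre></td></tr></table></figure>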
<h1 id="3-Kubelet-auth-to-pull-image"><a href="#3-Kubelet-auth-to-pull-image" class="headerlink" title="3 Kubelet auth to pull image"></a>3 Kubelet auth to pull image</h1><h2 id="3-1-Use-image-pull-secret-in-pod-spec"><a href="#3-1-Use-image-pull-secret-in-pod-spec" class="headerlink" title="3.1 Use image pull secret in pod spec"></a>3.1 Use image pull secret in pod spec</h2><p>When the kubelet pulls an image from a private registry, it uses the <code>docker-registry</code> secrets listed in the <code>imagePullSecrets</code> field of the pod specification. These secrets must be in <code>the same namespace</code> as the pod.</p>
<ul>
<li><p>To create a secret for docker authentication </p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kubectl create secret docker-registry myregistrykey --docker-server=DOCKER_REGISTRY_SERVER --docker-username=DOCKER_USER --docker-password=DOCKER_PASSWORD --docker-email=DOCKER_EMAIL</span><br></pre></td></tr></table></figure>
</li>
<li><p>Reference the secret in the pod spec</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Pod</span><br><span class="line">metadata:</span><br><span class="line">  name: foo</span><br><span class="line">  namespace: awesomeapps</span><br><span class="line">spec:</span><br><span class="line">  containers:</span><br><span class="line">  - name: foo</span><br><span class="line">    image: YOUR_PRIVATE_DOCKER_IMAGE</span><br><span class="line">  imagePullSecrets:</span><br><span class="line">  - name: myregistrykey</span><br></pre></td></tr></table></figure>
</li>
</ul>
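<p>Pods are usually created through a Deployment rather than directly. As a rough sketch, the same <code>imagePullSecrets</code> field then goes into the pod template of the Deployment. The label <code>app: foo</code> and the replica count are assumptions added only to make the manifest complete.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: apps/v1</span><br><span class="line">kind: Deployment</span><br><span class="line">metadata:</span><br><span class="line">  name: foo</span><br><span class="line">  namespace: awesomeapps</span><br><span class="line">spec:</span><br><span class="line">  replicas: 1                  # assumed value</span><br><span class="line">  selector:</span><br><span class="line">    matchLabels:</span><br><span class="line">      app: foo                 # hypothetical label, must match the template labels</span><br><span class="line">  template:</span><br><span class="line">    metadata:</span><br><span class="line">      labels:</span><br><span class="line">        app: foo</span><br><span class="line">    spec:</span><br><span class="line">      containers:</span><br><span class="line">      - name: foo</span><br><span class="line">        image: YOUR_PRIVATE_DOCKER_IMAGE</span><br><span class="line">      imagePullSecrets:        # same field as in the plain pod spec</span><br><span class="line">      - name: myregistrykey</span><br></pre></td></tr></table></figure>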
<h2 id="3-2-Use-image-pull-secret-in-pod-service-account"><a href="#3-2-Use-image-pull-secret-in-pod-service-account" class="headerlink" title="3.2 Use image pull secret in pod service account"></a>3.2 Use image pull secret in pod service account</h2><p>Since each pod is associated with a service account, we can also add the <code>imagePullSecrets</code> to the service account.<br>If we don’t specify a service account when defining a pod or deployment, the <code>default</code> service account of the namespace is used.</p>
<h3 id="3-2-1-Associate-a-secret-to-service-account"><a href="#3-2-1-Associate-a-secret-to-service-account" class="headerlink" title="3.2.1 Associate a secret to service account"></a>3.2.1 Associate a secret to service account</h3><p>To add a secret to a service account, add the top-level field <code>imagePullSecrets</code> to the ServiceAccount object.</p>
<ul>
<li><p>Patch an existing service account<br>We can patch the <code>default</code> service account as follows</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kubectl patch serviceaccount default -p '{"imagePullSecrets": [{"name": "myregistrykey"}]}'</span><br></pre></td></tr></table></figure>
</li>
<li><p>Create a new service account with imagePullSecrets</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: ServiceAccount</span><br><span class="line">metadata:</span><br><span class="line">  name: my-pod-used-service-account</span><br><span class="line">  namespace: awesomeapps</span><br><span class="line">imagePullSecrets:</span><br><span class="line">- name: myregistrykey</span><br></pre></td></tr></table></figure>
</li>
</ul>
<h3 id="3-2-2-Config-a-pod-to-use-the-service-account-field-‘serviceAccountName’-in-Pod-spec"><a href="#3-2-2-Config-a-pod-to-use-the-service-account-field-‘serviceAccountName’-in-Pod-spec" class="headerlink" title="3.2.2 Configure a pod to use the service account, field ‘serviceAccountName’ in Pod spec"></a>3.2.2 Configure a pod to use the service account, field ‘serviceAccountName’ in Pod spec</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Pod</span><br><span class="line">metadata:</span><br><span class="line">  name: foo</span><br><span class="line">  namespace: awesomeapps</span><br><span class="line">spec:</span><br><span class="line">  containers:</span><br><span class="line">  - name: foo</span><br><span class="line">    image: YOUR_PRIVATE_DOCKER_IMAGE</span><br><span class="line">  serviceAccountName: my-pod-used-service-account</span><br></pre></td></tr></table></figure>
<p>When a pod is created with the service account <code>my-pod-used-service-account</code>, the imagePullSecrets are added to its spec automatically. We can verify this with:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kubectl get pod THE_POD_NAME -o=jsonpath='{.spec.imagePullSecrets[0].name}{"\n"}'</span><br></pre></td></tr></table></figure>
<p>Related:<br><a href="https://kaichu.se/Kubernetes/2020/09/19/kubernetes-01-the-3-practical-ways-to-create-k8s-secret.html">https://kaichu.se/Kubernetes/2020/09/19/kubernetes-01-the-3-practical-ways-to-create-k8s-secret.html</a></p>
<p>References:<br><a target="_blank" rel="noopener" href="https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets-as-files-from-a-pod">https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets-as-files-from-a-pod</a></p>
<p><a target="_blank" rel="noopener" href="https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/">https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/</a></p>
</div>
<footer class="article-footer">
<a data-url="https://kaichu.se/Kubernetes/2020/09/21/Kubernetes-02-the-3-practical-ways-to-use-k8s-secret.html" data-id="ckuelnhpn001qexojgg4kgfo6" class="article-share-link">Share</a>
<a href="https://kaichu.se/Kubernetes/2020/09/21/Kubernetes-02-the-3-practical-ways-to-use-k8s-secret.html#disqus_thread" class="article-comment-link">Comments</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/k8s/" rel="tag">k8s</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/secret/" rel="tag">secret</a></li></ul>
</footer>
</div>
</article>
<article id="post-kubernetes-01-the-3-practical-ways-to-create-k8s-secret" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/Kubernetes/2020/09/19/kubernetes-01-the-3-practical-ways-to-create-k8s-secret.html" class="article-date">
<time datetime="2020-09-19T17:42:16.000Z" itemprop="datePublished">2020-09-19</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/Kubernetes/">Kubernetes</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/Kubernetes/2020/09/19/kubernetes-01-the-3-practical-ways-to-create-k8s-secret.html">Kubernetes - 01 - the 3 practical ways to create k8s secret</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p><img src="/assets/kubernetes.png" alt="kubernetes"><br>This is a summary of how to create a k8s secret object, which is used to store a piece of sensitive information.</p>
<h1 id="String-data-and-Base64-encoding"><a href="#String-data-and-Base64-encoding" class="headerlink" title="String data and Base64 encoding"></a>String data and Base64 encoding</h1><p>The values of a secret are stored as base64-encoded strings. To generate a base64 string from your password in bash:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ echo -n "mypassword123" | base64 -w0</span><br></pre></td></tr></table></figure>
<p>To decode a base64 string</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ echo 'MWYyZDFlMmU2N2Rm' | base64 --decode</span><br></pre></td></tr></table></figure>
<p>Note: The serialized JSON and YAML values of Secret data are encoded as base64 strings. Newlines are not valid within these strings and must be omitted. When using the base64 utility on Darwin/macOS, users should avoid using the <code>-b</code> option to split long lines. Conversely, Linux users should add the option <code>-w 0</code> to base64 commands, or use the pipeline <code>base64 | tr -d '\n'</code> if the <code>-w</code> option is not available.</p>
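<p>To show where such strings end up, here is a minimal sketch of a Secret manifest that stores already-encoded values under <code>data</code>. The name <code>mysecret-encoded</code> is hypothetical; the two values are the base64 forms of <code>admin</code> and <code>1f2d1e2e67df</code>.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Secret</span><br><span class="line">metadata:</span><br><span class="line">  name: mysecret-encoded       # hypothetical name</span><br><span class="line">type: Opaque</span><br><span class="line">data:                          # values under data must be base64 encoded</span><br><span class="line">  username: YWRtaW4=           # base64 of "admin"</span><br><span class="line">  password: MWYyZDFlMmU2N2Rm   # base64 of "1f2d1e2e67df"</span><br></pre></td></tr></table></figure>
<p>With <code>stringData</code> instead of <code>data</code>, the plain values can be written directly and the API server encodes them when the object is stored.</p>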
<h1 id="3-ways-to-manage-secrets"><a href="#3-ways-to-manage-secrets" class="headerlink" title="3 ways to manage secrets"></a>3 ways to manage secrets</h1><p>There are 3 ways to manage objects with the <a target="_blank" rel="noopener" href="https://kubernetes.io/docs/concepts/overview/working-with-objects/object-management/">kubectl cli</a>; the 3 corresponding ways to create secrets are as follows.</p>
<h2 id="3-1-Imperative-commands-to-edit"><a href="#3-1-Imperative-commands-to-edit" class="headerlink" title="3.1 Imperative commands to edit"></a>3.1 Imperative commands to edit</h2><h3 id="3-1-1-Create-from-file"><a href="#3-1-1-Create-from-file" class="headerlink" title="3.1.1 Create from file"></a>3.1.1 Create from file</h3><ul>
<li>Write the plain-text values to files (kubectl does the base64 encoding for us)<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ echo -n 'admin' > ./username.txt</span><br><span class="line">$ echo -n '1f2d1e2e67df' > ./password.txt</span><br></pre></td></tr></table></figure></li>
<li>Create the Secret from the files; by default each key is the name of its file<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl create secret generic db-user-pass \</span><br><span class="line"> --from-file=./username.txt \</span><br><span class="line"> --from-file=./password.txt</span><br></pre></td></tr></table></figure>
To specify different key names<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl create secret generic db-user-pass \</span><br><span class="line"> --from-file=username=./username.txt \</span><br><span class="line"> --from-file=password=./password.txt</span><br></pre></td></tr></table></figure>
</li>
</ul>
<h3 id="3-1-2-Create-from-literal"><a href="#3-1-2-Create-from-literal" class="headerlink" title="3.1.2 Create from literal"></a>3.1.2 Create from literal</h3><p>Wrap a literal that contains special characters in single quotes (') so that the shell does not interpret them</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">kubectl create secret generic dev-db-secret \</span><br><span class="line"> --from-literal=username=devuser \</span><br><span class="line"> --from-literal=password='S!B\*d$zDsb=' </span><br></pre></td></tr></table></figure>
<p>Note: to edit the Secret afterwards, run <code>kubectl edit secrets dev-db-secret</code></p>
<h2 id="3-2-Imperative-object-files"><a href="#3-2-Imperative-object-files" class="headerlink" title="3.2 Imperative object files"></a>3.2 Imperative object files</h2><h3 id="3-2-1-Using-here-doc"><a href="#3-2-1-Using-here-doc" class="headerlink" title="3.2.1 Using here doc"></a>3.2.1 Using here doc</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">cat <<EOF | kubectl apply -f -</span><br><span class="line">apiVersion: v1</span><br><span class="line">kind: Secret</span><br><span class="line">metadata:</span><br><span class="line"> name: mysecret</span><br><span class="line">type: Opaque</span><br><span class="line">data:</span><br><span class="line"> password: $(echo -n "s33msi4" | base64 -w0)</span><br><span class="line"> username: $(echo -n "jane" | base64 -w0)</span><br><span class="line">EOF</span><br></pre></td></tr></table></figure>
<h3 id="3-2-2-yaml-File"><a href="#3-2-2-yaml-File" class="headerlink" title="3.2.2 yaml File"></a>3.2.2 yaml File</h3><p>The here document above is equivalent to running the following 2 commands and putting their output into a YAML file</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">$ echo -n 'admin' | base64</span><br><span class="line">$ echo -n '1f2d1e2e67df' | base64</span><br><span class="line"></span><br><span class="line"># mysecret.yaml</span><br><span class="line">apiVersion: v1</span><br><span class="line">kind: Secret</span><br><span class="line">metadata:</span><br><span class="line"> name: mysecret</span><br><span class="line">type: Opaque</span><br><span class="line">data:</span><br><span class="line"> username: YWRtaW4=</span><br><span class="line"> password: MWYyZDFlMmU2N2Rm</span><br></pre></td></tr></table></figure>
<h3 id="3-2-3-string-data"><a href="#3-2-3-string-data" class="headerlink" title="3.2.3 string data"></a>3.2.3 string data</h3><p>The above is equivalent to the following stringData example; Kubernetes base64-encodes the stringData values when it creates the Secret</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Secret</span><br><span class="line">metadata:</span><br><span class="line"> name: mysecret</span><br><span class="line">type: Opaque</span><br><span class="line">stringData:</span><br><span class="line"> username: admin</span><br><span class="line"> password: 1f2d1e2e67df</span><br></pre></td></tr></table></figure>
<p>Note: you can specify both data and stringData in the same Secret; when a key appears in both, the value from stringData is used. I find stringData useful when I want to encode a few lines of information</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: v1</span><br><span class="line">kind: Secret</span><br><span class="line">metadata:</span><br><span class="line"> name: mysecret</span><br><span class="line">type: Opaque</span><br><span class="line">data:</span><br><span class="line"> username: YWRtaW4=</span><br><span class="line"> password: MWYyZDFlMmU2N2Rm</span><br><span class="line">stringData:</span><br><span class="line"> username: admin</span><br><span class="line"> password: 1f2d1e2e67df</span><br></pre></td></tr></table></figure>
<p>The values from stringData will be used.</p>
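<p>A minimal Python sketch of that precedence rule, assuming the API server simply encodes each stringData entry and writes it over any data entry with the same key. The function name <code>merge_secret_fields</code> and the sample value <code>b2xk</code> (base64 of "old") are made up for this illustration; this is not Kubernetes code.</p>

```python
import base64


def merge_secret_fields(data, string_data):
    """Return the effective data map of a Secret.

    Entries from string_data are base64-encoded and overwrite
    entries in data that share the same key.
    """
    merged = dict(data)
    for key, value in string_data.items():
        merged[key] = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return merged


effective = merge_secret_fields(
    data={"username": "YWRtaW4=", "password": "b2xk"},
    string_data={"password": "1f2d1e2e67df"},
)
print(effective)
# {'username': 'YWRtaW4=', 'password': 'MWYyZDFlMmU2N2Rm'}
```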
<h2 id="3-3-Using-kustomization-yml-file-Declarative-object-configuration"><a href="#3-3-Using-kustomization-yml-file-Declarative-object-configuration" class="headerlink" title="3.3 Using kustomization.yml file, Declarative object configuration"></a>3.3 Using kustomization.yml file, Declarative object configuration</h2><p>To use the <a target="_blank" rel="noopener" href="https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/">kustomization feature</a>, we first create a folder and add our kustomization.yaml file inside it</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ mkdir myconfigs && cd myconfigs</span><br><span class="line">$ touch kustomization.yaml</span><br></pre></td></tr></table></figure>
<h3 id="3-3-1-Generate-from-file"><a href="#3-3-1-Generate-from-file" class="headerlink" title="3.3.1 Generate from file"></a>3.3.1 Generate from file</h3><ul>
<li>Write the plain-text values to files <figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ echo -n 'admin' > ./username.txt</span><br><span class="line">$ echo -n '1f2d1e2e67df' > ./password.txt</span><br></pre></td></tr></table></figure></li>
<li>Add the following generator to the kustomization.yaml file<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">secretGenerator:</span><br><span class="line">- name: db-user-pass</span><br><span class="line"> files:</span><br><span class="line"> - username.txt</span><br><span class="line"> - password.txt</span><br></pre></td></tr></table></figure>
</li>
</ul>
<h3 id="3-3-2-Generate-from-literal"><a href="#3-3-2-Generate-from-literal" class="headerlink" title="3.3.2 Generate from literal"></a>3.3.2 Generate from literal</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">secretGenerator:</span><br><span class="line">- name: db-user-pass</span><br><span class="line"> literals:</span><br><span class="line"> - username=admin</span><br><span class="line"> - password=1f2d1e2e67df</span><br></pre></td></tr></table></figure>
<p>The next post will be <code>the 3 practical ways to use k8s secret</code></p>
<p>References:<br><a target="_blank" rel="noopener" href="https://kubernetes.io/docs/concepts/configuration/secret/">https://kubernetes.io/docs/concepts/configuration/secret/</a><br><a target="_blank" rel="noopener" href="https://kubernetes.io/docs/tasks/configmap-secret/managing-secret-using-kustomize/">https://kubernetes.io/docs/tasks/configmap-secret/managing-secret-using-kustomize/</a></p>
</div>
<footer class="article-footer">
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/k8s/" rel="tag">k8s</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/secret/" rel="tag">secret</a></li></ul>
</footer>
</div>
</article>
<article id="post-Airflow-variables-in-DAG" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/Airflow/2020/08/26/Airflow-variables-in-DAG.html" class="article-date">
<time datetime="2020-08-26T22:53:59.000Z" itemprop="datePublished">2020-08-26</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/Airflow/">Airflow</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/Airflow/2020/08/26/Airflow-variables-in-DAG.html">Airflow variables in DAG</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p>Variables are a generic way to store and retrieve arbitrary content or settings as a simple key value store within Airflow. </p>
<p>While your pipeline code definition and most of your constants and variables should be defined in code and stored in source control, it can be useful to have some variables or configuration items accessible and modifiable through the UI.</p>
<p>Variables can also be used to carry context for different environments.</p>
<p>There are 3 ways to create variables</p>
<ul>
<li>UI</li>
<li>CLI</li>
<li>code</li>
</ul>
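<h2 id="UI"><a href="#UI" class="headerlink" title="UI"></a>UI</h2><p>In the UI, we can navigate to Admin, then Variables, to manage them.</p>
<h2 id="CLI"><a href="#CLI" class="headerlink" title="CLI"></a>CLI</h2><p>From the CLI, we can use the following commands (Airflow 1.10 syntax; Airflow 2 replaces them with <code>airflow variables get</code> and <code>airflow variables set</code>)</p>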
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">airflow variables -g key</span><br><span class="line">airflow variables -s key value</span><br></pre></td></tr></table></figure>
<h2 id="Code"><a href="#Code" class="headerlink" title="Code"></a>Code</h2><p>No matter where we set up a variable, in the end we want to read it in a DAG so that we can easily change the context of a DAG run.</p>
<p>There are two ways to read variables in a DAG</p>
<ul>
<li><p>Python Code</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">from airflow.models import Variable</span><br><span class="line">Variable.set("foo", "value")</span><br><span class="line">foo = Variable.get("foo")</span><br><span class="line">bar = Variable.get("bar", deserialize_json=True)</span><br><span class="line">baz = Variable.get("baz", default_var=None)</span><br></pre></td></tr></table></figure>
</li>
<li><p>Jinja template<br>You can use a variable from a jinja template with the syntax, such as a bash operator command: </p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">echo {{ var.value.<variable_name> }}</span><br></pre></td></tr></table></figure>
<p>or if you need to deserialize a json object from the variable :</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">echo {{ var.json.<variable_name> }}</span><br></pre></td></tr></table></figure>
</li>
</ul>
<h2 id="Best-practice"><a href="#Best-practice" class="headerlink" title="Best practice"></a>Best practice</h2><p>You should avoid usage of Variables outside an operator’s execute() method or Jinja templates if possible, as Variables create a connection to metadata DB of Airflow to fetch the value, which can slow down parsing and place extra load on the DB.</p>
<p>A Variable lookup at the top level of a DAG file opens a DB connection every time the scheduler parses that file.</p>
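<p>The cost difference can be sketched without Airflow at all. The class <code>CountingVariableStore</code> below is a hypothetical stand-in for the metadata DB that only counts lookups; it is not part of Airflow. The sketch compares 100 parses of a file that reads the variable at the top level with 100 parses of a file that defers the read into the task callable.</p>

```python
class CountingVariableStore:
    """Hypothetical stand-in for the metadata DB; counts every lookup."""

    def __init__(self, values):
        self._values = values
        self.lookups = 0

    def get(self, key):
        self.lookups += 1
        return self._values[key]


store = CountingVariableStore({"env": "dev"})


def parse_dag_top_level():
    # Top-level access: executed on every parse of the DAG file
    return store.get("env")


def parse_dag_deferred():
    # Deferred access: parsing only defines the callable, no lookup happens
    def task():
        return store.get("env")
    return task


for _ in range(100):        # 100 scheduler parses
    parse_dag_top_level()
top_level_lookups = store.lookups

store.lookups = 0
for _ in range(100):        # 100 scheduler parses again
    task = parse_dag_deferred()
task()                      # one task execution
deferred_lookups = store.lookups

print(top_level_lookups, deferred_lookups)  # 100 1
```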
<h2 id="Example-to-understand-best-practice"><a href="#Example-to-understand-best-practice" class="headerlink" title="Example to understand best practice"></a>Example to understand best practice</h2><ul>
<li><p>Let’s set variable env=dev from CLI</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ airflow variables -s env dev</span><br></pre></td></tr></table></figure>
</li>
<li><p>Create a DAG</p>
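<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">from airflow import DAG</span><br><span class="line">from airflow.models import Variable</span><br><span class="line">from airflow.operators.bash_operator import BashOperator</span><br><span class="line">from airflow.operators.python_operator import PythonOperator</span><br><span class="line">from airflow.utils.dates import days_ago</span><br><span class="line"></span><br><span class="line"># Parse time: runs every time the scheduler parses this file</span><br><span class="line">env = Variable.get("env")</span><br><span class="line">print('' if env is None else env + 'parse time')</span><br><span class="line"></span><br><span class="line">def print_env():</span><br><span class="line">    # Execution time: runs only when the task executes</span><br><span class="line">    print(Variable.get("env") + 'execution time')</span><br><span class="line"></span><br><span class="line">dag = DAG('variables_demo', start_date=days_ago(1), schedule_interval=None)  # placeholder id and schedule</span><br><span class="line">with dag:</span><br><span class="line">    os_operator = PythonOperator(task_id = "os_operator", python_callable=print_env)</span><br><span class="line">    jinja_operator = BashOperator(task_id="get_variable_value", bash_command='echo {{ var.value.env }} ')</span><br></pre></td></tr></table></figure>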
</li>
<li><p>Explanation of the run<br>When the scheduler parses the DAG, which happens every few seconds, we find <em>devparse time</em> in the log<br>When the DAG runs, we see the BashOperator print the value of the variable, <em>dev</em></p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">//os_operator</span><br><span class="line">[2020-04-08 14:56:50,752] {{logging_mixin.py:112}} INFO - devexecution time</span><br><span class="line">...</span><br><span class="line">//get_variable_value</span><br><span class="line">[2020-04-08 14:56:59,133] {{bash_operator.py:115}} INFO - Running command: echo dev </span><br><span class="line">[2020-04-08 14:56:59,151] {{bash_operator.py:122}} INFO - Output:</span><br><span class="line">[2020-04-08 14:56:59,158] {{bash_operator.py:126}} INFO - dev</span><br></pre></td></tr></table></figure>
</li>
</ul>
</div>
<footer class="article-footer">
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Airflow/" rel="tag">Airflow</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Bigdata/" rel="tag">Bigdata</a></li></ul>
</footer>
</div>
</article>
<article id="post-Another-recommended-layout-to-use-profiles-in-an-ansible-playbook-project" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/Ansible/2020/08/19/Another-recommended-layout-to-use-profiles-in-an-ansible-playbook-project.html" class="article-date">
<time datetime="2020-08-19T21:04:31.000Z" itemprop="datePublished">2020-08-19</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/Ansible/">Ansible</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/Ansible/2020/08/19/Another-recommended-layout-to-use-profiles-in-an-ansible-playbook-project.html">Another recommended layout to use profiles in an ansible-playbook project</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p>Usually, I prefer to start a project from the recommended <a target="_blank" rel="noopener" href="https://docs.ansible.com/ansible/latest/user_guide/playbooks_best_practices.html#alternative-directory-layout">best practice layout</a> on the official Ansible website.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line">inventories/</span><br><span class="line"> production/</span><br><span class="line"> hosts # inventory file for production servers</span><br><span class="line"> group_vars/</span><br><span class="line"> group1.yml # here we assign variables to particular groups</span><br><span class="line"> group2.yml</span><br><span class="line"> host_vars/</span><br><span class="line"> hostname1.yml # here we assign variables to particular systems</span><br><span class="line"> hostname2.yml</span><br><span class="line"></span><br><span class="line"> staging/</span><br><span class="line"> hosts # inventory file for staging environment</span><br><span class="line"> group_vars/</span><br><span class="line"> group1.yml # here we assign variables to particular groups</span><br><span class="line"> group2.yml</span><br><span class="line"> host_vars/</span><br><span 
class="line"> stagehost1.yml # here we assign variables to particular systems</span><br><span class="line"> stagehost2.yml</span><br><span class="line"></span><br><span class="line">library/</span><br><span class="line">module_utils/</span><br><span class="line">filter_plugins/</span><br><span class="line"></span><br><span class="line">site.yml</span><br><span class="line">webservers.yml</span><br><span class="line">dbservers.yml</span><br><span class="line"></span><br><span class="line">roles/</span><br><span class="line"> common/</span><br><span class="line"> webtier/</span><br><span class="line"> monitoring/</span><br><span class="line"> fooapp/</span><br></pre></td></tr></table></figure>
<p>It covers the essentials of deploying to multiple environments, so we can easily switch between production and staging by running one of the following commands</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ ansible-playbook -i inventories/production webservers.yml -k -K --ask-vault-pass</span><br><span class="line">$ ansible-playbook -i inventories/staging webservers.yml -k -K --ask-vault-pass</span><br></pre></td></tr></table></figure>
<p>As I mentioned in my previous post <a href="https://kaichu.se/Ansible/2020/08/13/using-ansible-playbook-in-a-devops-pipeline.html">Using ansible playbook in a DevOps pipeline</a>, we could add an all.yml file to the playbook group_vars to give ansible-playbook the following information, so that we are not prompted for passwords.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">ansible_user: YOUR_USER_NAME</span><br><span class="line">ansible_password: YOUR_USER_PASSWORD</span><br><span class="line">ansible_become_password: YOUR_BECOME_PASSWORD</span><br></pre></td></tr></table></figure>
<p>The group_vars in the root of the playbook is called <code>playbook group_vars</code></p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">inventories/</span><br><span class="line">group_vars</span><br><span class="line"> all.yml</span><br><span class="line">webservers.yml</span><br></pre></td></tr></table></figure>
<p>I find this inconvenient when I am using my own user password instead of a service account shared between team members.<br>I don’t want to tell others my vault password, because then they could read my <code>ansible_password</code> and <code>ansible_become_password</code>.<br>Initially, I thought I could create a template, and everyone who wanted to use the playbook would copy the template and create their own all.yml locally. That results in the following project structure. </p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">inventories/</span><br><span class="line">group_vars</span><br><span class="line"> .gitignore -> all.yml</span><br><span class="line"> all.yml.cfg</span><br><span class="line"> all.yml (Anyone who doesn't want to use the -k -K --ask-vault-pass options can create this on their local machine)</span><br><span class="line">webservers.yml</span><br></pre></td></tr></table></figure>
<p>It turns out it’s even more cumbersome, obviously… </p>
<p>I found a better solution: we can use the <code>--extra-vars</code> option to achieve my goal without those constraints.<br>I decided to use the profile concept, which I learnt from the Ant build scripts at my previous company.<br>Here we don’t use playbook group_vars; instead, we create a profiles folder and add the vars for each profile, such as kai and chu</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">inventories/</span><br><span class="line"> production/</span><br><span class="line"> staging/</span><br><span class="line">profiles/</span><br><span class="line"> template/</span><br><span class="line"> all.yml</span><br><span class="line"> kai/</span><br><span class="line"> all.yml</span><br><span class="line"> chu/</span><br><span class="line"> all.yml</span><br><span class="line">webservers.yml</span><br></pre></td></tr></table></figure>
<p>I have put <code>ansible_user</code>, <code>ansible_password</code> and <code>ansible_become_password</code> in the all.yml in the folder <code>kai</code><br>Now we get the benefit of the profile by running the following command</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ ansible-playbook -i inventories/production --extra-vars @profiles/kai/all.yml webservers.yml --vault-password-file ~/.ansible-vault-pass</span><br></pre></td></tr></table></figure>
<p>This is an env/profile matrix solution: it gives us the flexibility to test our playbook with whichever vars we like<br>Let’s run the playbook with chu’s profile in staging before finishing this post</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ ansible-playbook -i inventories/staging --extra-vars @profiles/chu/all.yml webservers.yml --vault-password-file ~/.ansible-vault-pass</span><br></pre></td></tr></table></figure>
<h1 id="Summary"><a href="#Summary" class="headerlink" title="Summary"></a>Summary</h1><ul>
<li>It’s good to use <code>--extra-vars</code> when some variables are tied to the user running the playbook, in other words, when the variables differ from one Ansible user to another. </li>
<li>It would be more appropriate to add another inventory, such as inventories/test, if there are a lot of environment-related differences.</li>
</ul>
<p><img src="/assets/ansible-sible-book.png" alt="an sible book"><br><a target="_blank" rel="noopener" href="https://www.urbandictionary.com/define.php?term=Sible">Check out what sible means here in the urban dictionary</a></p>
</div>
<footer class="article-footer">
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Ansible/" rel="tag">Ansible</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/DevOps/" rel="tag">DevOps</a></li></ul>
</footer>
</div>
</article>
<article id="post-Airflow-start-date-with-cron-schedule-interval-is-not-confused-anymore-when-you-know-this" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/Airflow/2020/08/16/Airflow-start-date-with-cron-schedule-interval-is-not-confused-anymore-when-you-know-this.html" class="article-date">
<time datetime="2020-08-16T10:29:55.000Z" itemprop="datePublished">2020-08-16</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/Airflow/">Airflow</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/Airflow/2020/08/16/Airflow-start-date-with-cron-schedule-interval-is-not-confused-anymore-when-you-know-this.html">Airflow start_date with cron schedule_interval is not confused anymore when you know this</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p>An Airflow DAG start_date set with days_ago confuses us all the time. When will the DAG be kicked off? Will it be started at all?</p>
<blockquote>
<p>The first DAG Run is created based on the minimum start_date for the tasks in your DAG.<br>Subsequent DAG Runs are created by the scheduler process, based on your DAG’s schedule_interval, sequentially.</p>
</blockquote>
<p>The note from the official Airflow website makes sense at first glance; however, when you try to create your Airflow DAG with a cron string, it is hard to tell what it means.<br>When my friend Yuxia came to discuss her case of running a DAG every even day at 1 a.m., I thought it would be easy to do.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">from airflow import DAG</span><br><span class="line">from airflow.utils.dates import days_ago</span><br><span class="line"></span><br><span class="line">default_args = {</span><br><span class="line">    'start_date': days_ago(1),</span><br><span class="line">}</span><br><span class="line">dag = DAG('demo-cron-schedule-interval', default_args = default_args, schedule_interval='0 1 2-30/2 * *', ...)</span><br></pre></td></tr></table></figure>
<ul>
<li>I can check the correctness with <a target="_blank" rel="noopener" href="https://crontab.guru/#0_1_1-31/2_*_*">Crontab guru</a>.<br>Today is 2020-08-14. Going by the quote above, the start_date will be 2020-08-13 and the first DAG Run should be created for it, but my cron expression says that no DAG Run should be created for yesterday, since the 13th is an odd day. </li>
</ul>
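<p>The two facts in play can be checked with plain Python. This is only a sketch: <code>days_ago_from</code> mimics what <code>days_ago(n)</code> does (midnight UTC of the day n days back), <code>matches_day_field</code> hard-codes the day-of-month field <code>2-30/2</code>, and the 09:00 time of day is an assumed value. Both helper names are made up for this illustration.</p>

```python
from datetime import datetime, timedelta


def days_ago_from(now, n):
    """Mimic days_ago(n): midnight of the day n days before now."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n)


def matches_day_field(day):
    """True when day matches the day-of-month field 2-30/2 (2, 4, ..., 30)."""
    return day in range(2, 31, 2)


now = datetime(2020, 8, 14, 9, 0)          # assumed current time (UTC)
start_date = days_ago_from(now, 1)

print(start_date)                          # 2020-08-13 00:00:00
print(matches_day_field(start_date.day))   # False
print(matches_day_field(now.day))          # True
```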
<p>In fact, when she checked the system, no DAG Run had been started at 2020-08-14 01:00:00.</p>
<p>So what’s wrong with the quote? Why was Yuxia’s DAG not running?</p>
<p>Out of curiosity, I checked the code logic in the <a target="_blank" rel="noopener" href="https://github.com/apache/airflow/blob/514eb6d1e350818b31dca5adeaec2d7fd32b23ee/airflow/jobs/scheduler_job.py#L537">scheduler_job</a><br>In this post, I’ll try to explain the outcome.</p>
<h1 id="What-scope"><a href="#What-scope" class="headerlink" title="What scope"></a>What scope</h1><p>The DAG’s start_date is <em>not set</em> as a datetime.timedelta; it could be e.g. <code>days_ago(1)</code><br>The start_date is set in default args <em>ONLY</em><br>The DAG uses a cron string or a preset as schedule_interval, e.g. <code>0 1 2-30/2 * *</code></p>
<h1 id="Issue-to-explain"><a href="#Issue-to-explain" class="headerlink" title="Issue to explain"></a>Issue to explain</h1><p>Will the first DAG Run be kicked off by airflow scheduler?</p>
<h1 id="Concepts-from-code"><a href="#Concepts-from-code" class="headerlink" title="Concepts from code"></a>Concepts from code</h1><ul>
<li><p>DAG start_date resolution: the scheduler parses the DAGs every 5 seconds (depending on setup).<br>Each time the scheduler runs, it calculates a start_date from the current time (utcnow()).<br>days_ago(1) is resolved as follows. </p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">start_date = utcnow() - (1 day), with the time set to midnight by default, e.g. (day - 1) 00:00:00</span><br></pre></td></tr></table></figure>
<blockquote>
<p>It’s very important to realise that <code>start_date + (1 day) != utcnow()</code></p>
</blockquote>
</li>
<li><p>DAG start_date adjustment: Airflow starts subprocesses to create DAG Runs. Each one first checks the schedule_interval and, based on the current time, calculates the previous cron time (previous_cron), the cron time before that (previous_pre_cron) and the next cron time (next_cron).<br>previous_pre_cron -> previous_cron -> utcnow() -> next_cron.<br><img src="/assets/airflow/start_date1.jpg" alt="The cron times"><br>The scheduler then adjusts the start_date of the DAG. Within our scope, the adjustment follows this rule:<br>it picks the later of previous_pre_cron and the resolved start_date, and updates dag.start_date.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">dag.start_date = max(previous_pre_cron, dag.start_date)</span><br></pre></td></tr></table></figure></li>
<li><p>Normalizing the adjusted start_date into next_run_date, a step named <code>normalize_schedule</code> in the code. The adjusted start_date is what gets normalized, and the resulting next_run_date becomes the execution_date of the DAG Run. The step aligns the start_date to one of the cron times.<br>For example, if the cron times are <code>08-14 01:00:00</code> and <code>08-16 01:00:00</code>, any start_date in between, e.g. <code>08-15 00:00:00</code>, is aligned to <code>08-16 01:00:00</code>, the next cron time after the start_date. If the start_date is itself a cron time, the result is unchanged, e.g. <code>normalize_schedule(08-14 01:00:00)=08-14 01:00:00</code></p>
</li>
<li><p>Period end<br>From the <a target="_blank" rel="noopener" href="https://airflow.apache.org/docs/stable/faq.html">FAQ</a>, we know that <strong>Airflow scheduler triggers the task soon after the start_date + schedule_interval is passed</strong>, which I suspect causes confusion when the schedule_interval is a cron expression. </p>
<blockquote>
<p>From the code logic, I think it means execution_date + schedule_interval. If your cron means every 2 days, then the schedule_interval is 2 days.</p>
</blockquote>
</li>
</ul>
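<p>To make the first concept concrete, here is a minimal sketch of how a relative start_date moves with the scheduler clock. <code>resolve_days_ago</code> is a hypothetical stand-in written for this post: unlike the real <code>days_ago</code>, it takes the current time as an argument so the result can be reproduced.</p>

```python
from datetime import datetime, timedelta


def resolve_days_ago(n, now):
    """Hypothetical stand-in for days_ago(n): midnight of `now`, minus n days."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n)


# Three scheduler passes: two on 08-14 and one on 08-15.
passes = [
    datetime(2020, 8, 14, 0, 30),
    datetime(2020, 8, 14, 2, 0),
    datetime(2020, 8, 15, 8, 0),
]
resolved = [resolve_days_ago(1, now) for now in passes]

for now, start_date in zip(passes, resolved):
    # start_date + 1 day lands on midnight of the current day, not on `now`.
    print(now, "->", start_date, start_date + timedelta(days=1) != now)
```

<p>Both passes on 08-14 resolve to the same midnight, and the value only moves once the calendar day changes.</p>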
<h1 id="Figure-out-when-a-dag-will-be-scheduled"><a href="#Figure-out-when-a-dag-will-be-scheduled" class="headerlink" title="Figure out when a dag will be scheduled"></a>Figure out when a dag will be scheduled</h1><p>To answer the question, we need to go through 6 steps to get the result</p>
<ul>
<li>Cron time calculation, previous_pre_cron, previous_cron, next_cron</li>
<li>Resolve start_date</li>
<li>Adjust start_date to align with schedule_interval</li>
<li>Normalize adjusted start_date</li>
<li>Calculate period end</li>
<li>Decide a DAG run</li>
</ul>
<p>Let’s assume some facts to continue a calculation example.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">cron is set: `0 1 2-30/2 * *` </span><br><span class="line">start_date: days_ago(1)</span><br><span class="line">today: 2020-08-14 </span><br></pre></td></tr></table></figure>
<ul>
<li>Calculate previous_pre_cron, previous_cron and next_cron based on the time when the scheduler runs. Since the scheduler runs periodically, those times change during the day. We can take the following 3 examples. </li>
</ul>
<table>
<thead>
<tr>
<th>scheduler time</th>
<th>previous_pre_cron</th>
<th>previous_cron</th>
<th>next_cron</th>
</tr>
</thead>
<tbody><tr>
<td>08-14 00:30:00</td>
<td>08-10 01:00:00</td>
<td>08-12 01:00:00</td>
<td>08-14 01:00:00</td>
</tr>
<tr>
<td>08-14 02:00:00</td>
<td>08-12 01:00:00</td>
<td>08-14 01:00:00</td>
<td>08-16 01:00:00</td>
</tr>
<tr>
<td>08-15 08:00:00</td>
<td>08-12 01:00:00</td>
<td>08-14 01:00:00</td>
<td>08-16 01:00:00</td>
</tr>
</tbody></table>
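<p>The table can be reproduced with a small sketch. The helpers <code>next_cron</code> and <code>prev_cron</code> are hypothetical and hard-code the single expression <code>0 1 2-30/2 * *</code> (01:00 on even days of the month) using only the standard library; Airflow itself relies on the croniter library for this.</p>

```python
from datetime import datetime, timedelta


def next_cron(t):
    """First time strictly after t matching `0 1 2-30/2 * *` (01:00, even days)."""
    c = t.replace(hour=1, minute=0, second=0, microsecond=0)
    while c <= t or c.day % 2 != 0:
        c += timedelta(days=1)
    return c


def prev_cron(t):
    """Last time strictly before t matching `0 1 2-30/2 * *`."""
    c = t.replace(hour=1, minute=0, second=0, microsecond=0)
    while c >= t or c.day % 2 != 0:
        c -= timedelta(days=1)
    return c


def cron_times(now):
    """Return (previous_pre_cron, previous_cron, next_cron) around `now`."""
    previous_cron = prev_cron(now)
    previous_pre_cron = prev_cron(previous_cron)
    return previous_pre_cron, previous_cron, next_cron(now)


for now in (datetime(2020, 8, 14, 0, 30),
            datetime(2020, 8, 14, 2, 0),
            datetime(2020, 8, 15, 8, 0)):
    print(now, *cron_times(now), sep=" | ")
```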
<ul>
<li><p>Resolve start_date<br>Calculate the start_date based on the time when the scheduler runs; with a config such as days_ago(1), it changes over time as well.<br>It resolves to the same start_date for all scheduler times within one day: midnight of the previous day, as you can see in the following table.</p>
<table>
<thead>
<tr>
<th>scheduler time</th>
<th>start_date</th>
</tr>
</thead>
<tbody><tr>
<td>08-14 00:30:00</td>
<td>08-13 00:00:00</td>
</tr>
<tr>
<td>08-14 02:00:00</td>
<td>08-13 00:00:00</td>
</tr>
<tr>
<td>08-15 08:00:00</td>
<td>08-14 00:00:00</td>
</tr>
</tbody></table>
</li>
<li><p>Adjust start_date to align with schedule_interval<br>As discussed above, we compare the start_date with previous_pre_cron to get the real start_date. The later one wins!</p>
<table>
<thead>
<tr>
<th>scheduler time</th>
<th><strong>previous_pre_cron</strong></th>
<th>previous_cron</th>
<th>next_cron</th>
<th><strong>start_date</strong></th>
<th>adjusted_start</th>
</tr>
</thead>
<tbody><tr>
<td>08-14 00:30:00</td>
<td><strong>08-10 01:00:00</strong></td>
<td>08-12 01:00:00</td>
<td>08-14 01:00:00</td>
<td><span style="color:green"><strong>08-13 00:00:00</strong></span></td>
<td><span style="color:green">08-13 00:00:00</span></td>
</tr>
<tr>
<td>08-14 02:00:00</td>
<td><strong>08-12 01:00:00</strong></td>
<td>08-14 01:00:00</td>
<td>08-16 01:00:00</td>
<td><span style="color:green"><strong>08-13 00:00:00</strong></span></td>
<td><span style="color:green">08-13 00:00:00</span></td>
</tr>
<tr>
<td>08-15 08:00:00</td>
<td><strong>08-12 01:00:00</strong></td>
<td>08-14 01:00:00</td>
<td>08-16 01:00:00</td>
<td><span style="color:green"><strong>08-14 00:00:00</strong></span></td>
<td><span style="color:green">08-14 00:00:00</span></td>
</tr>
</tbody></table>
</li>
<li><p>Normalize the adjusted start_date to find the possible execution_date.<br>The normalized start_date (execution_date) is calculated in two steps: </p>
<ul>
<li>Find the next cron time (next_cron(adjusted_start)) and pre cron time (pre_cron(adjusted_start)) based on the <em>adjusted_start_date</em>. (which is different from now())</li>
<li>Compare to normalize</li>
</ul>
</li>
</ul>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">normalize(adjusted_start) = adjusted_start == pre_cron(adjusted_start) ? pre_cron(adjusted_start) : next_cron(adjusted_start)</span><br></pre></td></tr></table></figure>
<table>
<thead>
<tr>
<th>adjusted_start</th>
<th>pre_cron(adjusted_start)</th>
<th>next_cron(adjusted_start)</th>
<th>normalize(adjusted_start)</th>
</tr>
</thead>
<tbody><tr>
<td>08-13 00:00:00</td>
<td>08-12 01:00:00</td>
<td><span style="color:green">08-14 01:00:00</span></td>
<td><span style="color:green">08-14 01:00:00</span></td>
</tr>
<tr>
<td>08-13 00:00:00</td>
<td>08-12 01:00:00</td>
<td><span style="color:green">08-14 01:00:00</span></td>
<td><span style="color:green">08-14 01:00:00</span></td>
</tr>
<tr>
<td>08-14 00:00:00</td>
<td>08-12 01:00:00</td>
<td><span style="color:green">08-14 01:00:00</span></td>
<td><span style="color:green">08-14 01:00:00</span></td>
</tr>
</tbody></table>
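<p>The normalization rule can be sketched in a few lines. This is an illustration under the same assumptions as the rest of this post, not the Airflow source: <code>next_cron</code> and <code>prev_cron</code> are hypothetical helpers hard-coded for <code>0 1 2-30/2 * *</code>, and <code>normalize</code> keeps a date only when it already is a cron time.</p>

```python
from datetime import datetime, timedelta


def next_cron(t):
    """First time strictly after t matching `0 1 2-30/2 * *` (01:00, even days)."""
    c = t.replace(hour=1, minute=0, second=0, microsecond=0)
    while c <= t or c.day % 2 != 0:
        c += timedelta(days=1)
    return c


def prev_cron(t):
    """Last time strictly before t matching `0 1 2-30/2 * *`."""
    c = t.replace(hour=1, minute=0, second=0, microsecond=0)
    while c >= t or c.day % 2 != 0:
        c -= timedelta(days=1)
    return c


def normalize(dttm):
    """Return dttm if it is itself a cron time, otherwise the next cron time."""
    following = next_cron(dttm)
    # If stepping forward and then back does not land on dttm again,
    # dttm sits between two cron times and is pushed to the later one.
    if prev_cron(following) != dttm:
        return following
    return dttm


for d in (datetime(2020, 8, 13, 0, 0),
          datetime(2020, 8, 14, 1, 0),
          datetime(2020, 8, 15, 0, 0)):
    print(d, "->", normalize(d))
```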
<ul>
<li><p>Period end<br>It’s easier to get period end from the normalized start date</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Period end = normalize(adjusted_start) + schedule_interval</span><br></pre></td></tr></table></figure>
<table>
<thead>
<tr>
<th>adjusted_start</th>
<th>normalize(adjusted_start)</th>
<th>Period end</th>
</tr>
</thead>
<tbody><tr>
<td>08-13 00:00:00</td>
<td><strong>08-14 01:00:00</strong></td>
<td><strong>08-16 01:00:00</strong></td>
</tr>
<tr>
<td>08-13 00:00:00</td>
<td><strong>08-14 01:00:00</strong></td>
<td><strong>08-16 01:00:00</strong></td>
</tr>
<tr>
<td>08-14 00:00:00</td>
<td><strong>08-14 01:00:00</strong></td>
<td><strong>08-16 01:00:00</strong></td>
</tr>
</tbody></table>
</li>
<li><p>Decide if a run will be started<br>We compare the normalized start date and the period end with the current time again. If either one is later than now(), the scheduler won’t create a DAG Run.</p>
<table>
<thead>
<tr>
<th>scheduler time</th>
<th>adjusted_start</th>
<th>normalize(adjusted_start)</th>
<th>Period end</th>
<th>DAG Run?</th>
</tr>
</thead>
<tbody><tr>
<td>08-14 00:30:00</td>
<td>08-13 00:00:00</td>
<td><span style="color:red">08-14 01:00:00</span></td>
<td>08-16 01:00:00</td>
<td>no</td>
</tr>
<tr>
<td>08-14 02:00:00</td>
<td>08-13 00:00:00</td>
<td>08-14 01:00:00</td>
<td><span style="color:red">08-16 01:00:00</span></td>
<td>no</td>
</tr>
<tr>
<td>08-15 08:00:00</td>
<td>08-14 00:00:00</td>
<td>08-14 01:00:00</td>
<td><span style="color:red">08-16 01:00:00</span></td>
<td>no</td>
</tr>
</tbody></table>
</li>
</ul>
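<p>Putting the steps together, a sketch of the whole decision might look like the following. Everything here is an assumption made for illustration: <code>plan_run</code>, <code>next_cron</code> and <code>prev_cron</code> are hypothetical names, the cron helpers are hard-coded for <code>0 1 2-30/2 * *</code>, and the adjustment uses the simplified max() rule from this post.</p>

```python
from datetime import datetime, timedelta


def next_cron(t):
    """First time strictly after t matching `0 1 2-30/2 * *` (01:00, even days)."""
    c = t.replace(hour=1, minute=0, second=0, microsecond=0)
    while c <= t or c.day % 2 != 0:
        c += timedelta(days=1)
    return c


def prev_cron(t):
    """Last time strictly before t matching `0 1 2-30/2 * *`."""
    c = t.replace(hour=1, minute=0, second=0, microsecond=0)
    while c >= t or c.day % 2 != 0:
        c -= timedelta(days=1)
    return c


def plan_run(now, n_days):
    """Return (execution_date, period_end, create_run) for a scheduler pass at `now`."""
    # Resolve: days_ago(n_days) relative to this pass.
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    start_date = midnight - timedelta(days=n_days)
    # Adjust: the later of previous_pre_cron and the resolved start_date.
    adjusted = max(prev_cron(prev_cron(now)), start_date)
    # Normalize: keep a cron time, otherwise move to the next one.
    following = next_cron(adjusted)
    execution_date = adjusted if prev_cron(following) == adjusted else following
    # Period end: one cron step after the execution date.
    period_end = next_cron(execution_date)
    # Decide: both must already be in the past.
    create_run = execution_date <= now and period_end <= now
    return execution_date, period_end, create_run


for now in (datetime(2020, 8, 14, 0, 30),
            datetime(2020, 8, 14, 2, 0),
            datetime(2020, 8, 15, 8, 0)):
    print(now, *plan_run(now, 1), sep=" | ")
```

<p>With <code>days_ago(1)</code>, none of the three passes creates a DAG Run, because the period end is always still in the future.</p>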
<p>Let’s take the same example as above, but change the start_date to <code>days_ago(2)</code>.</p>
<ul>
<li><p>Cron times and adjusted_start</p>
<table>
<thead>
<tr>
<th>scheduler time</th>
<th>previous_pre_cron</th>
<th>previous_cron</th>
<th>next_cron</th>
<th>start_date</th>
<th>adjusted_start</th>
</tr>
</thead>
<tbody><tr>
<td>08-14 00:30:00</td>
<td>08-10 01:00:00</td>
<td><strong>08-12 01:00:00</strong></td>
<td>08-14 01:00:00</td>
<td>08-12 00:00:00</td>
<td><strong>08-12 00:00:00</strong></td>
</tr>
<tr>
<td>08-14 02:00:00</td>
<td><strong>08-12 01:00:00</strong></td>
<td>08-14 01:00:00</td>
<td>08-16 01:00:00</td>
<td>08-12 00:00:00</td>
<td><strong>08-12 01:00:00</strong></td>
</tr>
</tbody></table>
</li>
<li><p>normalize</p>
<table>
<thead>
<tr>
<th>scheduler time</th>
<th>adjusted_start</th>
<th>pre_cron(adjusted_start)</th>
<th>next_cron(adjusted_start)</th>
<th>normalize(adjusted_start)</th>
</tr>
</thead>
<tbody><tr>
<td>08-14 00:30:00</td>
<td>08-12 00:00:00</td>
<td>08-10 01:00:00</td>
<td>08-12 01:00:00</td>
<td>08-12 01:00:00</td>
</tr>
<tr>
<td>08-14 02:00:00</td>
<td>08-12 01:00:00</td>
<td>08-12 01:00:00</td>
<td>08-14 01:00:00</td>
<td>08-12 01:00:00</td>
</tr>
</tbody></table>
</li>
<li><p>Period end and decision </p>
<table>
<thead>
<tr>
<th>scheduler time</th>
<th>normalize(adjusted_start)</th>
<th>Period end</th>
<th>DAG Run?</th>
</tr>
</thead>
<tbody><tr>
<td>08-14 00:30:00</td>
<td>08-12 01:00:00</td>
<td><span style="color:red">08-14 01:00:00</span></td>
<td><span style="color:red">no</span></td>
</tr>
<tr>
<td>08-14 02:00:00</td>
<td>08-12 01:00:00</td>
<td><span style="color:green">08-14 01:00:00</span></td>
<td><span style="color:green">yes</span></td>
</tr>
</tbody></table>
</li>
</ul>
<h1 id="Summary"><a href="#Summary" class="headerlink" title="Summary"></a>Summary</h1><ul>
<li><p>The first DAG Run is created based on the minimum start_date for the tasks in your DAG.<br>It says <em>based on</em>, which doesn’t mean it will run the DAG at start_date.</p>
</li>
<li><p>Airflow scheduler triggers the task soon after the start_date + schedule_interval is passed<br>Here start_date doesn’t mean the start_date you put in default_args. In fact, when the schedule interval is a cron expression, it doesn’t mean any literal start_date.<br>It means the start_date you give after it has been resolved, adjusted and normalized.</p>
</li>
<li><p>Will a DAG Run be started?<br>Suppose we want to make sure a DAG Run starts on a specific day (2020-08-14). While the Airflow scheduler runs during that day (from 08-14 00:00:00 to 08-14 23:59:59), the start_date resolved from days_ago(2) is fixed at 2020-08-12 00:00:00, which makes it easier to reason about whether a DAG Run is triggered.<br><img src="/assets/airflow/airflow_start_date_duration.jpg" alt="The start_date"></p>
</li>
</ul>
<p><strong>The simple rule</strong> is to set the number in days_ago(number_of_days) to be the same as or larger than the interval of your cron, e.g. if the cron means every 2 days, then start_date is days_ago(2).</p>
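<p>As a rough check of this rule, the sketch below sweeps hourly scheduler passes over 2020-08-14 and records whether any of them would create the first DAG Run for <code>days_ago(1)</code>, <code>days_ago(2)</code> and <code>days_ago(3)</code>. The helpers are hypothetical, hard-coded for <code>0 1 2-30/2 * *</code>, and follow the simplified logic of this post rather than the scheduler source.</p>

```python
from datetime import datetime, timedelta


def next_cron(t):
    """First time strictly after t matching `0 1 2-30/2 * *` (01:00, even days)."""
    c = t.replace(hour=1, minute=0, second=0, microsecond=0)
    while c <= t or c.day % 2 != 0:
        c += timedelta(days=1)
    return c


def prev_cron(t):
    """Last time strictly before t matching `0 1 2-30/2 * *`."""
    c = t.replace(hour=1, minute=0, second=0, microsecond=0)
    while c >= t or c.day % 2 != 0:
        c -= timedelta(days=1)
    return c


def creates_run(now, n_days):
    """True if a scheduler pass at `now` would create the first DAG Run."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    start_date = midnight - timedelta(days=n_days)
    adjusted = max(prev_cron(prev_cron(now)), start_date)
    following = next_cron(adjusted)
    execution_date = adjusted if prev_cron(following) == adjusted else following
    # period_end <= now also implies execution_date <= now.
    return next_cron(execution_date) <= now


day = datetime(2020, 8, 14)
hourly_passes = [day + timedelta(hours=h) for h in range(24)]
triggered = {n: any(creates_run(now, n) for now in hourly_passes)
             for n in (1, 2, 3)}
print(triggered)
```

<p>Under these assumptions only <code>days_ago(1)</code> never triggers during the day, which matches the rule for a cron that fires every 2 days.</p>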
<ul>
<li>More<br>Once a DAG Run is triggered, the start_date is not that important anymore. Subsequent runs are calculated from the execution date of the previous DAG Run, which is already a normalized, fixed date. </li>
</ul>
</div>
<footer class="article-footer">
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Airflow/" rel="tag">Airflow</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Bigdata/" rel="tag">Bigdata</a></li></ul>
</footer>
</div>
</article>