#+AUTHOR: Simon Watson
#+TITLE: In Defense of Perl

* Preamble

I've wanted to blog about Perl for a while now. I've had this conversation
with quite a few friends, peers, and co-workers over the years and I can never
seem to fully get to the bottom of it.

Perl has obviously lost a lot of ground in recent decades, and is not really
what I would consider a popular language anymore. I think there are a few
reasons for this, and they're not unfair reasons.

With that said, in my experience Perl seems to have a very bad reputation. Not
only is it not popular, people are often offended by it. It seems to provoke
strong reactions.

Languages like Ruby and PHP are often called "dead" and "not modern", and seem
to have many people calling out their usage. Even so, both of these languages
seem to be thriving with strong communities, and for every person that nay-says
their use, there is another in the discussion celebrating them.

This doesn't seem to be the case with Perl in my experience.

Below I'd like to "defend" Perl a bit, and show why it still has a place in the
modern computing landscape. In doing so I will try to be clear about my own biases
and forthcoming about Perl's many issues.

Disclaimer/Note: Most of the comparisons I make in this article are between Perl
and Python. Not because Python is a bad language, but because Python is commonly
the language that's suggested as the "better" alternative to writing
something in Perl.

** My History and biases

Skip this section if you're more interested in the arguments I present on Perl's
behalf. This section is to give a little background and to try and enumerate my
biases in favor of Perl.
______________________________________________________________________________

*** Bias #1

Perl was the first language I felt I had learned to a pretty complete level. I'm
always hesitant to say I'm an expert in anything, but if I was ever an expert (or
close to it) in one area, it's probably Perl programming and syntax.

I think this immediately creates a bias in my head that Perl is a good language. I
think this kind of bias is pretty common when trying to talk about programming languages
objectively.

- I'm very familiar with it so it /feels/ easy
- I know all the standard patterns so I reach for it a lot, and
  it's /very/ quick for me to write relative to other languages;
  I often use it for prototyping even if I end up rewriting in
  another language later

*** Bias #2

I'm a sysadmin by trade, not a software engineer. Despite that, I've had to write
and maintain software (especially things like tooling) many times in my career.

I mention this because I think Perl favors this profession more so than strict
software engineering jobs. I'll touch more on this later, but wanted to mention
it here.

*** Bias #3

I very rarely need to deploy or write software for anything other than a Unix
environment, and even then 95% of the time, it's for a Linux environment. Again,
I'll cover this more later, but my experience and arguments are heavily biased
towards the /Unix/ Perl programming experience. Not only is this a bias, it's
a disclaimer of sorts, as I won't really be covering the non-Unix Perl
programming experience.

** Why Perl is "bad"

Before I get into my arguments for using Perl today, it's important to cover some
of its downsides and talk a bit about why I believe it's been relatively abandoned
by the modern software development community.

I think one of the biggest reasons Perl has fallen by the wayside is
the Perl5/Perl6 debacle. Others have covered this pretty extensively, so I won't
belabor the point, but in essence I think the effort to "modernize" Perl, and
try and make it into a language that non-Perl users could love, while still keeping
Perl5 users happy, was too tall of an order. The spiral that came out of it fragmented
the community and userbase, and gave the initiative to Python.

The Perl Wikipedia page has some decent coverage of the various Perl5/6/7 lineages:
https://en.wikipedia.org/wiki/Perl#Raku_(Perl_6)

( For some more interesting reading/background, see:
https://en.wikipedia.org/wiki/Outline_of_Perl )

Secondly, the syntax. I will make some arguments for Perl's syntax later, but I'll take a
brief moment here to acknowledge that it can look esoteric on a good day, and downright
illegible on a bad one. The heavy use of sigils is something I will aim to cast as a
positive later on, but I will admit that needing to memorize and have an awareness of
different context-sensitive sigils can make code look "messy" or hard to decipher. More
on this later.

Lastly, and by and large the argument I'm most likely to hear against using Perl:

"No one uses it."

In this blog post I hope to address these arguments and others, with
concrete and constructive counterpoints.

My aim in writing this is _not to convince people to program in Perl._ It's to convince
people that _Perl is a perfectly fine language to use_ for many different problem
areas -- it's to show that it may in fact be the /better/ choice for some problem areas.

I hope the distinction is clear and that I can convince you!

* Addressing Arguments

Preamble out of the way, I'll get right down to brass tacks.

** Perl Syntax

*** Examples
People often talk about how Perl is completely unreadable and "write only". This can
be true, but I think it can be true for /any/ language, and as such doesn't really feel
like a valid criticism.

With that said, let's explore it a bit.

Let's start with something basic like making a hash using two arrays:

#+BEGIN_SRC perl
#!/usr/bin/perl

my @keys = ("a", "b", "c");
my @vals = (1, 2, 3);
my %hash;
@hash{@keys} = @vals;    # hash slice: pairs keys and values by position

# Print in sorted key order, since hash order is not guaranteed
print "$_ : $hash{$_}\n" for sort keys %hash;

# Output:
# perl ar2h.pl
# a : 1
# b : 2
# c : 3
#+END_SRC

#+BEGIN_SRC python
keys = ['a', 'b', 'c']
values = [1, 2, 3]
hash = {key: value for key, value in zip(keys, values)}
print(hash)

# Output:
# python3 ar2h.py
# {'a': 1, 'b': 2, 'c': 3}
#+END_SRC

For those unfamiliar with Perl's syntax, I'll break down briefly what's happening here:

We have two arrays with data in them, and an empty hash. Hashes in Perl are denoted by the '%'
symbol, arrays by the '@' symbol.

By addressing the hash '%hash' with the '@' symbol, we are essentially addressing one dimension of
the hash. This is called a hash slice, and this 'syntax sugar' gives us an extremely ergonomic way to
reason about how data assignment is working in the assignment line.

We're taking the hash 'hash' and assigning the array 'keys' to its first dimension (the keys), and
the array 'vals' to its second dimension (the values). Because arrays are ordered, this mapping is
intuitive and predictable.

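For readers coming from Python, the positional pairing done by the slice assignment can be spelled out as an
explicit loop. This is only an illustrative sketch: the names mirror the Perl example, and the
=dict(zip(...))= form at the end is a more compact equivalent.

#+BEGIN_SRC python
keys = ["a", "b", "c"]
vals = [1, 2, 3]

# Explicit version of what the hash slice does:
# pair the i-th key with the i-th value.
paired = {}
for i, key in enumerate(keys):
    paired[key] = vals[i]

# Same result without the loop.
compact = dict(zip(keys, vals))

print(paired)
#+END_SRC
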
There's nothing terrible about the example Python code to me, but the idea that it's intrinsically
more readable doesn't ring true for me. It's just different, and presupposes that you understand
its assignment syntax in the way Perl presupposes you understand its sigils and tokens. To reiterate,
Python's syntax is no better or worse than Perl's in this case -- it's just different. The programmer
may have preferences for one or the other, but I don't think an argument can be made that one or the
other is objectively better.

With a simple example out of the way, I'm going to provide a code example from some code I wrote in
the past month:

#+BEGIN_SRC perl
if ( $log_file_path =~ m/(\d{4}-\d{2}-\d{2})\.log/ ) {
    my $log_date = $1;
    $log_date =~ tr/-//d;
    if ( $log_date < $LATEST_DATE ) {
        next;
    } else {
        my ($serial, $parsed_log_ref) = parse_log($log_file_path, \&json_line_parser);
        my $output_file_path = $PROCESSED_LOG_DIR_PATH . "/" . $serial;
        write_parsed_log_array($output_file_path, $parsed_log_ref);
    }
} else {
    die "Couldn't match log date in &process_seat_dir, exiting...\n";
}
#+END_SRC

This is a code path taken frequently in a log parser I wrote for something at my job.

Let's walk through the code and break it down in plain English. Feel free to skip this if it's self-evident:

Enter the =if= block if the variable =$log_file_path= matches a regex that looks something like =$YEAR-$MONTH-$DAY.log=.

Upon entering the block, copy the first regex capture group (enclosed in =()= in the regex) into a variable,
=$log_date=.

Use the Perl built-in =tr///= operator to remove any =-= chars from the string.

Compare the resulting string (something that looks like =$YEAR$MONTH$DAY=) to a variable we set elsewhere in
the function scope. If it's lower than =$LATEST_DATE=, =next= skips ahead to the next loop iteration; otherwise
we fall through to the =else= block.

Call =parse_log()= and assign what it returns. =parse_log()= expects to be passed two arguments: a string and
a function reference. It returns two values: a string, and a reference to an array, which represents an ordered
list of the lines in a file.

We assign these two return values to variables called =$serial= and =$parsed_log_ref=.

Construct a path name via the Perl built-in string concatenation operator ( =.= ) and assign it to =$output_file_path=.

Finally, call a function that will flatten and write out the array of log lines to a file.

End syntax explanation.
____________________________________________________________

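The comparison step works because a zero-padded =YYYYMMDD= string, read as a number, sorts in the same order
as the date it encodes. A small Python sketch of that idea follows; =date_key= is a hypothetical helper and the
cutoff value is made up, neither is part of the log parser.

#+BEGIN_SRC python
def date_key(log_date: str) -> int:
    """Turn 'YYYY-MM-DD' into the integer YYYYMMDD (mirrors tr/-//d in the Perl snippet)."""
    return int(log_date.replace("-", ""))

latest_date = 20220315  # stand-in for $LATEST_DATE; the value is invented for illustration

for candidate in ("2021-12-31", "2022-03-15", "2022-04-01"):
    # Older than the cutoff: skip it. Otherwise: process it.
    action = "skip" if date_key(candidate) < latest_date else "process"
    print(candidate, action)
#+END_SRC
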
I think there are two potentially tricky Perl syntax-isms in the above snippet.

Firstly, the data type of =$parsed_log_ref= is completely opaque. If you don't have insight into what =parse_log=
is returning, you have no idea that =$parsed_log_ref= is an array reference. Statically typed languages obviously
solve this kind of problem for you, but I think in the domain of dynamic languages, this is a common problem
that comes with the territory. To my knowledge Python and Ruby don't have great answers for this (please feel
free to correct me on this).

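One partial answer on the Python side is optional type hints, which external checkers such as mypy can verify
but which the interpreter itself does not enforce. The sketch below annotates a hypothetical Python counterpart
of =parse_log=. The function body, the choice to take a list of lines instead of a path, and the assumption that
each record carries a ="serial"= field are placeholders for illustration, not the real parser.

#+BEGIN_SRC python
import json
from typing import Callable

def json_line_parser(line: str) -> dict:
    """Hypothetical per-line parser: decode one JSON object per line."""
    return json.loads(line)

def parse_log(lines: list[str], line_parser: Callable[[str], dict]) -> tuple[str, list[dict]]:
    """Return (serial, parsed_lines). Assumes the first record has a 'serial' field."""
    parsed = [line_parser(line) for line in lines]
    serial = str(parsed[0]["serial"])
    return serial, parsed

# The signature now tells the reader that the second return value is a list of dicts.
serial, parsed_log = parse_log(['{"serial": "A1", "msg": "boot"}'], json_line_parser)
print(serial, len(parsed_log))
#+END_SRC
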
Secondly, unless you're familiar with Perl's tokens, it's unclear what =\&json_line_parser= is. I think this kind of
notation can actually be a /plus/ for Perl.

If I am passing some data to a function by reference, it's pretty clear what that data is (assuming it's not
encapsulated in a scalar like the aforementioned =$parsed_log_ref= example):

- =\@array=
- =\%hash=
- =\&function=
- =\$scalar=

For me personally, being able to denote type at a glance can be useful, as opposed to bare words in languages like Python,
where a lot of the time it's up to me as the reader to understand all the surrounding context in order to know what type
a variable is. I realize that in Python you can sometimes use ={}= or =[]= for type hints.

As mentioned above, Perl has this issue as well, but I think to a lesser extent than dynamically typed
languages that don't denote type with any kind of special syntax.

I think though that it's difficult to make an _objective_ argument in this regard, so... moving on.

*** Perlvars and Perl Magic

A friend brought up another interesting point with the last code snippet: the use of the Perl built-in =$1= var.

To me there seems to be some difficulty in making _objective_ arguments around syntax preferences, but in
discussing this article with a friend, it became clear there are maybe _objective_ arguments to make against
some of the abstractions Perl provides the user.

In the context of the previous code snippet, I'm using =m//= and regex capture groups to assign a variable:
#+BEGIN_SRC perl
if ( $log_file_path =~ m/(\d{4}-\d{2}-\d{2})\.log/ ) {
    my $log_date = $1;
    # ...
#+END_SRC

My friend's argument against this kind of magic assignment was that it breaks well understood mental models of
how programs operate: You can't use a variable you haven't defined.

In the greater context of the program this snippet comes from, =$1= is never assigned anywhere;
it's set for me by Perl's regex engine after a successful match.

I was convinced that this kind of behavior is more harmful than the way Python handles this problem
(=re.match= etc.) because it forces the reader, who may be familiar with many other languages, but not
Perl, to understand Perl-specific implementation details.

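For comparison, here is roughly how the same capture looks with Python's =re= module, where the result of the
match is an explicit object rather than an implicit =$1=. The path is a made-up example, and =re.search= is used
instead of =re.match= because the date does not sit at the start of the string.

#+BEGIN_SRC python
import re

log_file_path = "/var/log/app/2022-03-15.log"  # hypothetical path for illustration

# re.search returns a match object (or None); captures are read from it explicitly.
match = re.search(r"(\d{4}-\d{2}-\d{2})\.log", log_file_path)
if match is None:
    raise SystemExit("Couldn't match log date")

# group(1) plays the role of $1; the dashes are stripped as tr/-//d does in Perl.
log_date = match.group(1).replace("-", "")
print(log_date)
#+END_SRC
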
I think this is a very valid criticism, and it leads to the obvious question: what benefit
do you get from this complexity? In a word: brevity.

* Perl is /fast/

You're right. It's probably not as fast as C, but below I will try and show that for a lot of cases, Perl is
much faster than its main competition, Python, particularly in certain domains.

Note: I know benchmarking can be delicate and, if not handled carefully, can produce poor data and poorer conclusions.
I've tried to be as fair and accurate as possible in these comparisons, and I'm making an effort to act in
good faith. If you believe I've made a mistake or that there is a faster way to do something, please let me know.

The first basic example generates a file with 1,000,000 newline-separated 5-char strings:

Perl:
#+BEGIN_SRC perl
#!/usr/bin/perl
# perl --version | head -n2 | tail -n
# This is perl 5, version 34, subversion 0 (v5.34.0) built for x86_64-linux

use strict;
use warnings;

my @chars = ( "A".."Z" );
foreach ( 1..1000000 ) {
    my $string = "";
    foreach ( 1..5 ) {
        $string = $string . $chars[ rand @chars ];
    }
    print("$string\n");
}
#+END_SRC

Result:
#+BEGIN_EXAMPLE
/tmp/tmp.G5LzZmzDQq λ time ./gen_words.pl > output.txt

real 0m1.065s
user 0m1.031s
sys 0m0.003s
#+END_EXAMPLE

Python:
#+BEGIN_SRC python
#python -V
#Python 3.10.2
import string
import random

word_list = list(string.ascii_uppercase)

for x in range(0,1000000):
    word = ""  # named "word" so the imported string module is not shadowed
    for y in range(0,5):
        word = word + random.choice(word_list)

    print(word)

#+END_SRC

Result:
#+BEGIN_EXAMPLE
/tmp/tmp.G5LzZmzDQq λ time python gen_words.py > output2.txt

real 0m3.986s
user 0m3.968s
sys 0m0.011s
#+END_EXAMPLE

Python ends up being about 3.7x slower here (3.986s vs 1.065s wall-clock).

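One Python variant that may be worth timing builds each word with =random.choices= and =str.join= instead of
repeated concatenation. It is not benchmarked here, so treat it as a sketch rather than a result; the
=gen_words= name is made up for illustration.

#+BEGIN_SRC python
import random
import string

def gen_words(count: int, length: int = 5) -> list[str]:
    """Return `count` random uppercase words of `length` characters each."""
    alphabet = string.ascii_uppercase
    # random.choices draws `length` letters in one call; join builds the word once.
    return ["".join(random.choices(alphabet, k=length)) for _ in range(count)]

# Small demo; pass 1000000 to match the size used in the benchmark.
print("\n".join(gen_words(3)))
#+END_SRC
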
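The second basic example takes a file of 1,000,000 newline-separated 5-char strings and returns how many are valid
words.

#+BEGIN_SRC perl
#!/usr/bin/perl

use strict;
use warnings;

# Sketch: both paths are assumptions (a system word list and the generated file).
my ($dict_path, $input_path) = ("/usr/share/dict/words", "output.txt");

open(my $dict_fh, "<", $dict_path) or die "Can't open $dict_path: $!\n";
my %is_word = map { chomp; uc($_) => 1 } <$dict_fh>;

open(my $in_fh, "<", $input_path) or die "Can't open $input_path: $!\n";
my $count = 0;
while ( my $word = <$in_fh> ) {
    chomp $word;
    $count++ if $is_word{$word};
}
print("$count\n");
#+END_SRC
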
* There is no better Unix glue

- Here talk about Perl's "best" use case, as a glue language
  for processing text streams and/or unstructured text data

* Lesser known Perl features

- Perl magic goes here

* Feedback/Topics/Notes To cover

- Top 10 things not to do in Python code
- People prefer Python as more people know it
- Perception Python stdlib is more complete
- People like Perl for its portability
- People like Perl for text generation/report generation
- People like Perl for its use of one-liners
- Cover "Higher-Order Perl"
- Perl's history as a "sysadmin lang" re: Larry Wall/Randal Schwartz
- Perl is more like Lisp than it is like C, and this is an important
  distinction
- Talk about higher order functions
- Talk about string function references
- Talk about how Perl is /faster/ than Python in most text stream
  processing cases (prove this!)
- Not welcoming to newcomers
391 -