scribu 2017-05-18T04:03:47+03:00 http://scribu.net scribu mail@scribu.net http://scribu.net/blog/asynchronous-http-requests-in-python-3.5 Asynchronous HTTP Requests in Python 3.5+ 2016-11-11T00:00:00+02:00 scribu http://scribu.net <p>So you’ve heard that Python now supports that fancy <code class="highlighter-rouge">async/await</code> syntax. You want to play with it, but <a href="http://lucumr.pocoo.org/2016/10/30/i-dont-understand-asyncio/">asyncio seems intimidating</a>.</p> <p>Well, someone wrote a simpler alternative to asyncio. It’s called <a href="https://curio.readthedocs.io/">Curio</a> and people are saying good things <a href="https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-asyncawait-world/">about it</a>. <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup></p> <p>In this tutorial, I’m going to show you how to make non-blocking HTTP requests using Curio.</p> <p>Since it doesn’t have a high-level HTTP client yet, I whipped up a small library called <a href="https://github.com/scribu/curio-http">curio-http</a> <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>, so you’ll need to install that as well.</p> <h3 id="the-syntax">The syntax</h3> <p>Let’s start with a single request:</p> <div class="highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">curio</span> <span class="kn">import</span> <span class="nn">curio_http</span> <span class="n">async</span> <span class="k">def</span> <span class="nf">main</span><span class="p">():</span> <span class="n">async</span> <span class="k">with</span> <span class="n">curio_http</span><span class="o">.</span><span class="n">ClientSession</span><span class="p">()</span> <span class="k">as</span> <span class="n">session</span><span class="p">:</span> <span class="n">response</span> <span class="o">=</span> <span class="n">await</span> <span class="n">session</span><span class="o">.</span><span class="n">get</span><span 
class="p">(</span><span class="s">'https://httpbin.org/get'</span><span class="p">)</span> <span class="k">print</span><span class="p">(</span><span class="s">'Status code:'</span><span class="p">,</span> <span class="n">response</span><span class="o">.</span><span class="n">status_code</span><span class="p">)</span> <span class="n">content</span> <span class="o">=</span> <span class="n">await</span> <span class="n">response</span><span class="o">.</span><span class="n">json</span><span class="p">()</span> <span class="k">print</span><span class="p">(</span><span class="s">'Content:'</span><span class="p">,</span> <span class="n">content</span><span class="p">)</span> <span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span> <span class="n">curio</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">main</span><span class="p">())</span> </code></pre> </div> <p>You use <code class="highlighter-rouge">async def</code> to declare what’s called a <em>coroutine</em>. The last line — <code class="highlighter-rouge">curio.run(main())</code> — kicks off the coroutine.</p> <p>What’s inside the <code class="highlighter-rouge">main</code> coroutine should look familiar, if you’ve ever used the <a href="http://python-requests.org">requests</a> library.</p> <p>At each point where <code class="highlighter-rouge">await</code> is called, the coroutine could theoretically yield control to a different coroutine. 
However, since there are no other coroutines here, the script behaves roughly like a synchronous program:</p> <ol> <li>Create an HTTP session.</li> <li>Make an HTTP request.</li> <li>Wait for the response headers.</li> <li>Print the response status code.</li> <li>Wait for the response content.</li> <li>Print the content.</li> </ol> <h3 id="achieving-concurrency">Achieving concurrency</h3> <p>To reap the benefits of asynchronous I/O, it’s not enough to sprinkle our programs with the <code class="highlighter-rouge">async</code> and <code class="highlighter-rouge">await</code> keywords. We need to encode which operations can be executed independently (concurrently) and which need to happen one after the other (sequentially).</p> <p>Sequential execution:</p> <div class="highlighter-rouge"><pre class="highlight"><code><span class="n">response1</span> <span class="o">=</span> <span class="n">await</span> <span class="n">session</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s">'https://foo.com'</span><span class="p">)</span>
<span class="n">response2</span> <span class="o">=</span> <span class="n">await</span> <span class="n">session</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s">'https://bar.com'</span><span class="p">)</span>
</code></pre> </div> <p>Concurrent execution:</p> <div class="highlighter-rouge"><pre class="highlight"><code><span class="n">task1</span> <span class="o">=</span> <span class="n">await</span> <span class="n">curio</span><span class="o">.</span><span class="n">spawn</span><span class="p">(</span><span class="n">session</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s">'https://foo.com'</span><span class="p">))</span>
<span class="n">task2</span> <span class="o">=</span> <span class="n">await</span> <span class="n">curio</span><span class="o">.</span><span class="n">spawn</span><span class="p">(</span><span
class="n">session</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s">'https://bar.com'</span><span class="p">))</span> </code></pre> </div> <p><code class="highlighter-rouge">curio.spawn()</code> is how you express the idea “I want this coroutine to be executed in the background”. The thing that’s spawned is what Curio calls a <em>task</em>.</p> <p>Let’s look at an example that fetches a list of URLs concurrently by spawning a task for each one:</p> <div class="highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">curio</span> <span class="kn">import</span> <span class="nn">curio_http</span> <span class="n">async</span> <span class="k">def</span> <span class="nf">fetch_one</span><span class="p">(</span><span class="n">url</span><span class="p">):</span> <span class="n">async</span> <span class="k">with</span> <span class="n">curio_http</span><span class="o">.</span><span class="n">ClientSession</span><span class="p">()</span> <span class="k">as</span> <span class="n">session</span><span class="p">:</span> <span class="n">response</span> <span class="o">=</span> <span class="n">await</span> <span class="n">session</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">)</span> <span class="n">content</span> <span class="o">=</span> <span class="n">await</span> <span class="n">response</span><span class="o">.</span><span class="n">json</span><span class="p">()</span> <span class="k">return</span> <span class="n">response</span><span class="p">,</span> <span class="n">content</span> <span class="n">async</span> <span class="k">def</span> <span class="nf">main</span><span class="p">(</span><span class="n">url_list</span><span class="p">):</span> <span class="n">tasks</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">url</span> <span class="ow">in</span> <span 
class="n">url_list</span><span class="p">:</span> <span class="n">task</span> <span class="o">=</span> <span class="n">await</span> <span class="n">curio</span><span class="o">.</span><span class="n">spawn</span><span class="p">(</span><span class="n">fetch_one</span><span class="p">(</span><span class="n">url</span><span class="p">))</span> <span class="n">tasks</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">task</span><span class="p">)</span> <span class="k">for</span> <span class="n">task</span> <span class="ow">in</span> <span class="n">tasks</span><span class="p">:</span> <span class="n">response</span><span class="p">,</span> <span class="n">content</span> <span class="o">=</span> <span class="n">await</span> <span class="n">task</span><span class="o">.</span><span class="n">join</span><span class="p">()</span> <span class="k">print</span><span class="p">(</span><span class="s">'GET </span><span class="si">%</span><span class="s">s'</span> <span class="o">%</span> <span class="n">response</span><span class="o">.</span><span class="n">url</span><span class="p">)</span> <span class="k">print</span><span class="p">(</span><span class="n">content</span><span class="p">)</span> <span class="k">print</span><span class="p">()</span> <span class="n">url_list</span> <span class="o">=</span> <span class="p">[</span> <span class="s">'http://httpbin.org/delay/1'</span><span class="p">,</span> <span class="s">'http://httpbin.org/delay/2'</span><span class="p">,</span> <span class="s">'http://httpbin.org/delay/3'</span><span class="p">,</span> <span class="s">'http://httpbin.org/delay/4'</span><span class="p">,</span> <span class="p">]</span> <span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span> <span class="n">curio</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">main</span><span 
class="p">(</span><span class="n">url_list</span><span class="p">))</span> </code></pre> </div> <p>Each URL in the list takes a fixed number of seconds to respond, from 1 to 4 (the number at the end of the path).</p> <p>If we were to fetch them sequentially, it would take 1+2+3+4=10 seconds in total.</p> <p>Since we’re using tasks, the requests overlap and the run time is only around 4 seconds, the duration of the slowest one.</p> <h3 id="controlling-concurrency">Controlling concurrency</h3> <p>What if we want to scrape a site, but we don’t want to hammer it with too many concurrent connections?</p> <p>The simplest approach is to use what’s called a bounded semaphore.</p> <p>Let’s see what changes we would need to make to the above example:</p> <div class="highlighter-rouge"><pre class="highlight"><code> import curio
 import curio_http
<span class="gi">+MAX_CONNECTIONS_PER_HOST = 2
+
+sema = curio.BoundedSemaphore(MAX_CONNECTIONS_PER_HOST)
+
</span> async def fetch_one(url):
<span class="gd">-    async with curio_http.ClientSession() as session:
</span><span class="gi">+    async with sema, curio_http.ClientSession() as session:
</span>         response = await session.get(url)
         content = await response.json()
         return response, content
</code></pre> </div> <p>Here, we’re using not one, but two context managers: the semaphore and the HTTP session.</p> <p>The semaphore is <em>acquired</em> each time a task enters the <code class="highlighter-rouge">async with</code> block. It’s <em>released</em> right after the URL has finished being fetched.</p> <p>If <code class="highlighter-rouge">MAX_CONNECTIONS_PER_HOST</code> tasks already hold the semaphore, the next task that tries to acquire it will wait until one of them releases it.</p> <p>To learn about more neat features of Curio, such as timeout handling and events, check out the excellent introductory <a href="http://curio.readthedocs.io/en/latest/tutorial.html">tutorial</a>.</p> <p>I hope this has given you a glimpse of what modern async I/O can look like in Python. 
All of the libraries used in this tutorial are in a very early state right now, but I think they have a lot of potential.</p> <div class="footnotes"> <ol> <li id="fn:1"> <p>The downside is that it doesn’t work on Windows right now or with older versions of Python. <a href="#fnref:1" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:2"> <p>Under the hood, it leverages this new thing called a <a href="https://sans-io.readthedocs.io/">sans I/O network protocol</a>. <a href="#fnref:2" class="reversefootnote">&#8617;</a></p> </li> </ol> </div> http://scribu.net/blog/3d-maps-using-d3-and-three.js 3D maps using D3 and three.js 2015-02-23T00:00:00+02:00 scribu http://scribu.net <p>Last weekend I participated in the <a href="http://odd15.datedeschise.ro/cluj/">Open Data Day Hackathon</a> in Cluj-Napoca. I was glad to see that there were other people nearby interested in open data.</p> <p>My idea was to make a 3D map of Romania, with each county elevated according to its population. It was really satisfying to go from this:</p> <p><img src="/assets/img/census-data.png" alt="census data" /></p> <p>to this:</p> <p><img src="/assets/img/3d-map.png" alt="3D map" /></p> <p>Here’s a <a href="http://scribu.github.io/romania-3d/">live demo</a> (requires a browser with WebGL support).</p> <p>The main advantage of this visualization mode is that you can see all the counties at once, so you get a holistic sense of population distribution, as well as population density.</p> <h3 id="features">Features</h3> <p>You can select the census year to view data from. 
If you hover over a county with the mouse (or tap on it on a touchscreen), you can see the county name and population count in the upper-right corner.</p> <p>The map can be tilted by dragging and zoomed by scrolling, so you can get the best view of a particular area.</p> <h3 id="implementation">Implementation</h3> <p>From a <a href="https://github.com/mbostock/topojson">topojson</a> file, the geographic data for each county is projected to a 2D path using <a href="http://d3js.org/">D3</a>. This path is then converted to 3D geometry using <a href="https://github.com/asutherland/d3-threeD">d3-threeD</a> and extruded based on the census data. Finally, the geometry is rendered using <a href="http://threejs.org/">three.js</a>. The approach was inspired by <a href="http://www.smartjava.org/content/render-geographic-information-3d-threejs-and-d3js">Jos Dirksen’s post</a>.</p> <p>One thing I learned is that d3-threeD chokes on shapes with holes in them, so I had to <a href="https://github.com/scribu/romania-3d/commit/e4fa4600cd2e9df8632217f632f0f47248cb8f84">remove them</a>.</p> <p>To detect hovering over a particular county, a raycaster is used to check whether a ray cast from the cursor position intersects any of the county meshes.</p> <p>For UI rendering I used <a href="http://facebook.github.io/react/">React</a>, mostly because I wanted to play with it.</p> <p>For more details about the implementation, check out the <a href="https://github.com/scribu/romania-3d/">source code</a>.</p> http://scribu.net/blog/properly-forwarding-email-to-gmail Properly forwarding email to Gmail 2014-11-18T00:00:00+02:00 scribu http://scribu.net <p>So I have a custom email address, <em>mail@example.com</em>, which I want to forward to <em>example@gmail.com</em>.</p> <p>Since I use NameCheap as my registrar, I had the option of setting up email forwarding via their UI. 
Very easy, but with drawbacks: sometimes email just wouldn’t arrive and I had no way of figuring out what the problem was.</p> <p>Then, while moving my hosting to <a href="https://www.digitalocean.com/?refcode=e785d2574328">DigitalOcean</a>, I decided to do the forwarding on my own server. So I configured the MX records and added virtual aliases in Postfix and pretty soon got a strange error from Google’s SMTP servers:</p> <blockquote> <p>Our system has detected an unusual rate of 421-4.7.0 unsolicited mail originating from your IP address. To protect our 421-4.7.0 users from spam, mail sent from your IP address has been temporarily 421-4.7.0 rate limited. Please visit 421-4.7.0 http://www.google.com/mail/help/bulk_mail.html to review our Bulk 421 4.7.0 Email Senders Guidelines. f14si1032016icj.42 - gsmtp (in reply to end of DATA command))</p> </blockquote> <p>After a bit of Googling, I found that somebody had the <a href="https://www.digitalocean.com/community/questions/temporarily-rate-limited-from-google-in-mail-relay">same problem</a>. One suggested fix went like this: instead of pushing email to Gmail, make Gmail fetch it via POP3. So that’s what I did:</p> <ol> <li>Enabled POP3 access by installing Courier.</li> <li>Created a separate <code class="highlighter-rouge">postmaster</code> user (which doesn’t have <code class="highlighter-rouge">sudo</code>).</li> <li>Routed all incoming mail to the <code class="highlighter-rouge">postmaster</code> user.</li> <li>Added the credentials in Gmail (<em>Settings → Accounts and Import</em>).</li> </ol> <p>and it seems to work great. Gmail fetches mail from my server at its own pace, while leaving malicious email in place. 
I should probably set up a cron job to delete messages left in <code class="highlighter-rouge">Maildir/cur/</code>.</p> <p>Basically, everything you need to know is in this <a href="https://help.ubuntu.com/community/PostfixBasicSetupHowto">excellent tutorial</a> on the Ubuntu Wiki.</p> <p>For a long time, Postfix was, for me, that thing that never works right, but after reading a few clear tutorials, it actually sort of makes sense. So, drop me a line anytime. :)</p> http://scribu.net/blog/left-wordpress Left WordPress 2014-11-18T00:00:00+02:00 scribu http://scribu.net <p>It seems like most people who follow me on Twitter still think that I’m involved with WordPress. I haven’t done any WordPress-related work in over 6 months and I don’t intend to do any WordPress-related work in the future.</p> <p>I <a href="http://scribu.net/blog/switched-to-jekyll.html">stopped using</a> WordPress at the end of 2012. I stopped contributing to WordPress completely more than a year ago. All my plugins have been <a href="http://scribu.net/wordpress/plugin-help-wanted.html">up for adoption</a> for quite a while. Six months ago, I <a href="http://wp-cli.org/blog/new-maintainer-daniel-bachhuber.html">handed the reins</a> of WP-CLI to Daniel Bachhuber and it seems to be going well. I don’t blog about WordPress anymore and I don’t go to WordCamps either.</p> <p>Ok, but why? A few reasons:</p> <p>Firstly, technical aspects: crappy language (PHP, an ancient version to boot) and crappy architecture (everything touches <code class="highlighter-rouge">WP_Query</code>, a <a href="https://en.wikipedia.org/wiki/God_object">god object</a> if I ever saw one).</p> <p>Secondly, slow development process: I think one of the required ingredients for the success of the WordPress platform is its commitment to backwards compatibility. 
The downside is that WordPress is stuck with its crappy architecture forever and improvements take a lot more effort to land.</p> <p>Lastly, I realized that there’s a whole world out there beyond web content management systems that I wanted to explore.</p> <p>Having said all that, I will miss the WordPress community. It’s hard to leave behind all the friendly faces I’ve spent so much time with online (and sometimes even offline), but it’s something I have to do.</p> <p>Onward!</p> http://scribu.net/blog/high-tech-folly High Tech's Folly 2014-06-28T00:00:00+03:00 scribu http://scribu.net <p>I am afflicted with this disease:</p> <blockquote> <p>In large measure the high casualty rate of knowledge-based industry is the fault of the knowledge-based, and especially the high-tech, entrepreneurs themselves. They tend to be contemptuous of anything that is not “advanced knowledge”, and particularly of anyone who is not a specialist in their own area. They tend to be infatuated by their own technology, often believing that “quality” means what is technically sophisticated rather than what gives value to the user.</p> </blockquote> <p><a href="http://www.amazon.com/gp/product/0060851139/ref=as_li_tl?ie=UTF8&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0060851139&amp;linkCode=as2&amp;tag=scribunet-20&amp;linkId=52W3WVXXBHX7SNFP">Peter F. Drucker - Innovation and Entrepreneurship</a></p> http://scribu.net/blog/travis-ci-build-stats Travis CI Build Statistics 2014-04-10T00:00:00+03:00 scribu http://scribu.net <p>I don’t know about you, but I think <a href="https://travis-ci.org">Travis CI</a> is the best thing that happened to open-source development since Github.</p> <p>I noticed that my builds seemed to be getting slower lately. 
Looking at the build history in the regular Travis CI interface wasn’t very conclusive, because you can only see 10 or 20 builds at a time:</p> <p><img src="/assets/img/travis-org.png" alt="travis site screenshot" /></p> <p>Even after pressing the “Show more” button repeatedly, I still wasn’t sure that it wasn’t just a random fluctuation. What I needed was to see all the builds at the same time.</p> <p>After a few hours of reading the <a href="https://api.travis-ci.org/docs/">Travis API docs</a> and fiddling with <a href="http://d3js.org/">D3</a>, this is what I came up with:</p> <p><img src="/assets/img/travis-charts.png" alt="travis chart screenshot" /></p> <p>It incrementally loads the most recent ~500 builds and plots them both individually and grouped by day. Clicking on one of the thin bars sends you to that build’s page on travis-ci.org where you can see all the available information about it.</p> <p>Looking at the build durations bar chart on the left, you can see that builds have indeed gotten slower, from 15-20 minutes per build to about 20-25 minutes.</p> <p>You can inspect the build times for your own projects as well: <a href="http://scribu.github.io/travis-stats/">http://scribu.github.io/travis-stats/</a></p> <p>If you have other neat visualization ideas, open an issue on Github: <a href="https://github.com/scribu/travis-stats">https://github.com/scribu/travis-stats</a></p> http://scribu.net/blog/cross-browser-uploads-ember-moxie Cross-browser AJAX uploads with Ember.js and mOxie 2014-02-24T00:00:00+02:00 scribu http://scribu.net <p>Implementing single-page web applications that work on all browsers remains a challenge. 
For the basic task of uploading files, you still need some sort of polyfill or library that adds <a href="http://caniuse.com/#search=FileReader">support</a> for older browsers (read IE 8 and 9, which are still in wide use).</p> <p>In this tutorial I’m going to describe how to integrate one such library, called <a href="https://github.com/moxiecode/moxie">mOxie</a>, with one client-side MVC framework, called <a href="http://emberjs.com/">Ember.js</a>.</p> <h3 id="getting-the-moxie-library">0. Getting the mOxie library</h3> <p>I’m going to assume you already have an Ember app going, so the first step is acquiring the mOxie files. You can either use the <a href="https://github.com/moxiecode/moxie/blob/master/bin/">pre-built files</a> or <a href="https://github.com/moxiecode/moxie#build-instructions">compile your own</a>. For example, we won’t need XHR2 support in this tutorial, so we can leave it out.</p> <h3 id="defining-the-template">1. Defining the template</h3> <p>The next thing we have to do is write the Handlebars template that will contain all the UI elements we need:</p> <script src="https://gist.github.com/scribu/e8cc6dcaeddb9df07d27.js?file=template.hbs"></script> <p>The UI has several components:</p> <ul> <li>error and progress notifications</li> <li>the list of selected files</li> <li>the button for selecting more files</li> <li>the button for initiating the upload</li> </ul> <h3 id="initializing-the-file-picker">2. Initializing the file picker</h3> <p>In the template above we placed the button inside a view. We can use that view to convert the <code class="highlighter-rouge">&lt;button&gt;</code> into a file picker:</p> <script src="https://gist.github.com/scribu/e8cc6dcaeddb9df07d27.js?file=view.js"></script> <p>Here we create a <code class="highlighter-rouge">mOxie.FileInput</code> instance once the template containing the button is rendered.</p> <h3 id="addingremoving-files">3. 
Adding/removing files</h3> <p>The view we defined in the previous step will send events up to the controller, which has to respond to them:</p> <script src="https://gist.github.com/scribu/e8cc6dcaeddb9df07d27.js?file=controller-1.js"></script> <p>The neat thing about Ember.js is that it will automatically re-render the template whenever the <code class="highlighter-rouge">attachments</code> property is modified.</p> <h3 id="uploading-the-files">4. Uploading the files</h3> <p>Finally, when the user wants to submit the form, we have to actually send the files to the server:</p> <script src="https://gist.github.com/scribu/e8cc6dcaeddb9df07d27.js?file=controller-2.js"></script> <p>We start uploading all the files concurrently. When one is done, we increment a counter. When all of them are done, we clear the queue. Did I mention <a href="http://domenic.me/2012/10/14/youre-missing-the-point-of-promises/">promises</a> are great?</p> <p>And here are the helper functions used in the controller above:</p> <script src="https://gist.github.com/scribu/e8cc6dcaeddb9df07d27.js?file=helpers.js"></script> <p>I wrapped both the <code class="highlighter-rouge">mOxie.FileReader</code> process and the AJAX request in RSVP promises so that chaining and utility methods such as <code class="highlighter-rouge">.catch()</code> always work as expected.</p> <h3 id="demo">Demo</h3> <p>I’ve set up a quick <a href="http://scribu.github.io/ember-moxie-demo/">demo</a> so that you can see it in action.</p> <p>This is just a starting point, of course. You can add all sorts of usability enhancements, such as progress bars, image previews etc. Happy hacking!</p> http://scribu.net/plugin-dependencies/new-maintainer-x-team New Maintainer for Plugin Dependencies: X-Team 2014-01-27T00:00:00+02:00 scribu http://scribu.net <p>I handed off development of the <a href="http://wordpress.org/plugins/plugin-dependencies/">Plugin Dependencies</a> plugin to <a href="https://github.com/x-team">X-Team</a>. 
They have released all sorts of interesting tools; check them out.</p> <p>The official Github repository is now <a href="https://github.com/x-team/wp-plugin-dependencies">https://github.com/x-team/wp-plugin-dependencies</a>.</p> http://scribu.net/blog/the-wind-blows-them-away Quote From "The Little Prince" 2013-11-24T00:00:00+02:00 scribu http://scribu.net <p>“Where are the people?” the little prince asked, politely.</p> <p>The flower had once seen a caravan passing.</p> <p>“People?” she echoed. “I think there are six or seven of them in existence. I saw them, several years ago. But one never knows where to find them. The wind blows them away. They have no roots, and that makes their life very difficult.”</p> http://scribu.net/blog/test-doubles-in-pure-php Creating test doubles in pure PHP 2013-09-20T00:00:00+03:00 scribu http://scribu.net <p>The PHP world is not known for good unit test coverage. It’s mostly a cultural issue, but there is a technical aspect to it as well.</p> <p>PHPUnit allows you to create <a href="http://phpunit.de/manual/3.7/en/test-doubles.html">mock objects</a>, but that assumes your codebase uses the Dependency Injection pattern. If not, it’s very hard to add unit tests without major refactoring, because the language doesn’t support monkey patching (i.e. redefining functions and methods at runtime).</p> <p>You could install the <a href="http://php.net/manual/en/book.runkit.php">runkit</a> extension, which allows you to replace everything, including constants. 
However, that means that anyone who wants to run the tests needs to re-compile their PHP.</p> <blockquote> <p><a href="https://github.com/antecedent/patchwork">Patchwork</a> is a PHP library that makes it possible to redefine user-defined functions and methods at runtime, replicating the functionality of <code class="highlighter-rouge">runkit_function_redefine</code> in pure PHP 5.3 code.</p> </blockquote> <p>I started using it to create mocks and it’s great, but I had this nagging thought: how the hell does it work?</p> <p>I set out to figure out what black magic the author uses, only to find that he already wrote a very easy to understand description of the <a href="http://antecedent.github.io/patchwork/docs/implementation.html">implementation</a>, like any responsible developer would. Neat!</p> <p>Gems like these are few and far between in the PHP ecosystem, but they do exist.</p>
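<p>For anyone who hasn’t seen monkey patching in a language that supports it natively, here is a minimal sketch in Python rather than PHP. It is not Patchwork’s API and says nothing about how Patchwork works internally; <code class="highlighter-rouge">build_label</code> and <code class="highlighter-rouge">check_build_label</code> are made-up names for illustration, and the stub is installed with the standard library’s <code class="highlighter-rouge">unittest.mock.patch</code>.</p>

```python
import time
from unittest import mock


def build_label():
    """Hypothetical code under test: it depends on the real system clock."""
    return "built at %d" % time.time()


def check_build_label():
    # Swap time.time for a stub that returns a fixed value; the original
    # function is restored automatically when the with-block exits.
    with mock.patch("time.time", return_value=1379635200):
        assert build_label() == "built at 1379635200"

    # Outside the block, the real clock is back in place.
    assert build_label() != "built at 1379635200"


if __name__ == "__main__":
    check_build_label()
    print("stub was applied and then removed")
```

<p>The point of the sketch is only that the function under test is left untouched: the dependency is redefined at runtime for the duration of the test, which is the capability the post says PHP lacks out of the box.</p>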