mirror of
				https://github.com/balkian/balkian.github.com.git
				synced 2025-10-30 15:18:17 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			268 lines
		
	
	
		
			25 KiB
		
	
	
	
		
			XML
		
	
	
	
	
	
			
		
		
	
	
		
	
	
	
	
	
| <?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Programming on J. Fernando Sánchez</title><link>https://balkian.com/categories/programming/</link><description>Recent content in Programming on J. Fernando Sánchez</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Wed, 26 Feb 2025 23:22:59 +0100</lastBuildDate><atom:link href="https://balkian.com/categories/programming/index.xml" rel="self" type="application/rss+xml"/><item><title>Bridging RDF, JSON-LD and Dataclasses</title><link>https://balkian.com/p/bridging-rdf-json-ld-and-dataclasses/</link><pubDate>Wed, 26 Feb 2025 23:22:59 +0100</pubDate><guid>https://balkian.com/p/bridging-rdf-json-ld-and-dataclasses/</guid><description><p>In the RDF world, data is expressed as a collection of triples.
 | |
| These triples can contain IRIs that may or may not be accessible or valid.
 | |
| And the use of these IRIs may or may not adhere to a vocabulary.
 | |
| Checking the validity of the IRIs and the semantics of the triples is an additional step.</p>
 | |
| <h2 id="the-rdflib-way">The <code>rdflib</code> way
 | |
| </h2><p><code>rdflib</code> only models IRIs, values and namespaces.
 | |
| Developers need to be cognisant of the URIs they are using, and the vocabularies being used.
 | |
| Prior to version 2.0, senpy followed a very similar model.
 | |
| It had a base class to represent a generic node.
 | |
| Each instance then gets its own automatically generated id, and will act like a normal dictionary, whose keys and values will be serialized as a JSON-LD dictionary.
 | |
| Multiple subclasses were also included to model specific types of node, mostly to provide convenience methods for the given subtype.
 | |
| Here is an example of a subclass, <code>Entry</code>.</p>
 | |
| <div class="highlight"><div class="chroma">
 | |
| <table class="lntable"><tr><td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code><span class="lnt">1
 | |
| </span><span class="lnt">2
 | |
| </span><span class="lnt">3
 | |
| </span><span class="lnt">4
 | |
| </span><span class="lnt">5
 | |
| </span></code></pre></td>
 | |
| <td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="n">entry</span> <span class="o">=</span> <span class="n">Entry</span><span class="p">()</span>
 | |
| </span></span><span class="line"><span class="cl">
 | |
| </span></span><span class="line"><span class="cl"><span class="n">entry</span><span class="p">[</span><span class="s1">&#39;vocab:property&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">25</span>
 | |
| </span></span><span class="line"><span class="cl">
 | |
| </span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">entry</span><span class="o">.</span><span class="n">jsonld</span><span class="p">())</span>
 | |
| </span></span></code></pre></td></tr></table>
 | |
| </div>
 | |
| </div><p>Would print something like this:</p>
 | |
| <div class="highlight"><div class="chroma">
 | |
| <table class="lntable"><tr><td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code><span class="lnt">1
 | |
| </span><span class="lnt">2
 | |
| </span><span class="lnt">3
 | |
| </span><span class="lnt">4
 | |
| </span><span class="lnt">5
 | |
| </span></code></pre></td>
 | |
| <td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
 | |
| </span></span><span class="line"><span class="cl"> <span class="nt">&#34;@id&#34;</span><span class="p">:</span> <span class="s2">&#34;:Entry_202505....&#34;</span><span class="p">,</span>
 | |
| </span></span><span class="line"><span class="cl"> <span class="nt">&#34;@type&#34;</span><span class="p">:</span> <span class="s2">&#34;prefix:Entity&#34;</span><span class="p">,</span>
 | |
| </span></span><span class="line"><span class="cl"> <span class="nt">&#34;vocab:property&#34;</span><span class="p">:</span> <span class="mi">25</span>
 | |
| </span></span><span class="line"><span class="cl"><span class="p">}</span>
 | |
| </span></span></code></pre></td></tr></table>
 | |
| </div>
 | |
| </div><p>Producing correct triples using this model requires using the vocabularies and URIs properly, with little to no tooling to enforce it.
 | |
| This poses a big problem for a tool like Senpy, which aims to make it easier for professionals without a background in RDF to build and consume semantic NLP services.
 | |
| If an attribute is not a URI and is not included in the global JSON-LD context, it will not generate a triple in the final graph.
 | |
| Moreover, there is no way to enforce that the vocabularies and the URIs are used correctly.</p>
 | |
| <p>Pros:</p>
 | |
| <ul>
 | |
| <li>Flexible/extensible</li>
 | |
| <li>Lightweight. This is mostly JSON-LD in Python&rsquo;s clothing.</li>
 | |
| <li>Naturally maps to both <code>rdflib</code> and writing <code>json-ld</code></li>
 | |
| </ul>
 | |
| <p>Cons:</p>
 | |
| <ul>
 | |
| <li>Discoverability. Documentation and examples are needed to know which attributes to use</li>
 | |
| <li>Error-prone. It is easy to misuse a property, or introduce typos</li>
 | |
| <li>Tight coupling with semantics/RDF. One needs to know a thing or two about RDF, especially if new vocabularies or annotations need to be used.</li>
 | |
| </ul>
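<p>To make the trade-offs above concrete, here is a minimal sketch of a dictionary-like node. It is not senpy's actual code: the <code>Node</code> class, the <code>prefix</code> attribute and the id format are assumptions made for illustration only.</p>

```python
import json
from datetime import datetime, timezone


class Node(dict):
    """Hypothetical generic node: a plain dict with an auto-generated @id."""

    prefix = "prefix"  # assumed vocabulary prefix, for illustration only

    def __init__(self, **attrs):
        super().__init__(**attrs)
        name = type(self).__name__
        stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S%f")
        # Only fill in @id and @type if the caller did not provide them
        self.setdefault("@id", f":{name}_{stamp}")
        self.setdefault("@type", f"{self.prefix}:{name}")

    def jsonld(self):
        # Nothing is validated here: every key is serialized as-is
        return json.dumps(self, indent=2)


class Entry(Node):
    """Subclasses mostly exist to provide a type name and convenience methods."""


entry = Entry()
entry["vocab:property"] = 25
entry["vocab:proprety"] = 30  # a typo in the property name goes unnoticed
print(entry.jsonld())
```

<p>Both the intended property and the misspelled one end up in the output, which is the error-prone behaviour listed in the cons.</p>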
 | |
| <h2 id="the-object-oriented-way">The object-oriented way
 | |
| </h2><p>An obvious solution to this problem in an object-oriented language like Python is to use classes to represent our data model.
 | |
| These classes can define the specific attributes available, and typing annotations can serve both as a guide for the developer, and as a means to automatically
 | |
| validate objects at runtime.
 | |
| There are tools like <a class="link" href="https://pydantic.dev/" target="_blank" rel="noopener"
 | |
| >pydantic</a> that make this process very simple.
 | |
| Then, we only need to define how our models should be serialized into JSON-LD.
 | |
| We can thoroughly test this serialization to ensure that the resulting object is correct and produces the right RDF graph.
 | |
| Going back to our previous example, we could define an Entry class as a dataclass, and define all the possible types of annotations as attributes.</p>
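<p>As a rough sketch of that idea, using only the standard library rather than pydantic: the field names (<code>text</code>, <code>language</code>) and the <code>vocab:</code> keys below are hypothetical, not senpy's real model.</p>

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import uuid4


@dataclass
class Entry:
    # Hypothetical attributes; a real model would list every known annotation
    text: str
    language: Optional[str] = None
    id: str = field(default_factory=lambda: f":Entry_{uuid4().hex}")

    def __post_init__(self):
        # Plain dataclasses do not validate types, so check by hand
        if not isinstance(self.text, str):
            raise TypeError("text must be a string")

    def jsonld(self) -> dict:
        # The mapping to RDF terms lives in one place, inside the class
        doc = {"@id": self.id, "@type": "prefix:Entry", "vocab:text": self.text}
        if self.language is not None:
            doc["vocab:language"] = self.language
        return doc


entry = Entry(text="hello world", language="en")
print(entry.jsonld())
```

<p>With this model, a misspelled or unknown attribute such as <code>Entry(text="hi", sentiment=1)</code> fails immediately with a <code>TypeError</code> instead of silently producing an unexpected key.</p>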
 | |
| <p>This model works great when all the possible attributes are known ahead of time.
 | |
| But it starts to break when the model provided is not comprehensive enough, or consumers of your library need to provide their own ad-hoc annotations / attributes.
 | |
| This could be solved by encouraging consumers of our library to define their own subclasses whenever they need to add new attributes.
 | |
| This works perfectly fine for serialization, but it breaks if your library needs to automatically deserialize these subclasses.
 | |
| It also breaks if different parts of the code need to add their own attributes on the same data at the same time.
 | |
| This was precisely the case of <code>senpy</code>, where entities are annotated by different plugins, each providing a different set of annotations.</p>
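<p>The deserialization problem can be illustrated with a small sketch. It assumes the library keeps a registry that maps <code>@type</code> values to classes; the registry, <code>from_jsonld</code> and <code>PluginEntry</code> are hypothetical names, not part of senpy.</p>

```python
from dataclasses import dataclass

REGISTRY = {}  # hypothetical mapping: "@type" value -> class


def register(cls):
    """Make a class known to the library's deserializer."""
    REGISTRY[cls.rdf_type] = cls
    return cls


@register
@dataclass
class Entry:
    rdf_type = "prefix:Entry"  # no annotation, so this is not a dataclass field
    text: str = ""


def from_jsonld(doc):
    """Rebuild an object from a JSON-LD dictionary using the registry."""
    cls = REGISTRY.get(doc["@type"])
    if cls is None:
        raise LookupError(f"unknown type: {doc['@type']}")
    return cls(text=doc.get("vocab:text", ""))


# A consumer defines a subclass in their own code and never registers it
@dataclass
class PluginEntry(Entry):
    rdf_type = "plugin:Entry"
    emotion: str = ""


print(from_jsonld({"@type": "prefix:Entry", "vocab:text": "hi"}))
try:
    from_jsonld({"@type": "plugin:Entry", "vocab:text": "hi"})
except LookupError as err:
    print(err)  # the library cannot know about the consumer's subclass
```

<p>Even if the subclass were registered, <code>from_jsonld</code> would still not know about its extra <code>emotion</code> attribute, and two plugins annotating the same entry would each need their own subclass.</p>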
 | |
| <p>Pros:</p>
 | |
| <ul>
 | |
| <li>Discoverability. All possible attributes are known ahead of time, including their possible types.</li>
 | |
| <li>Decoupling from RDF. Developers only need to know about the dataclasses provided. The mapping to the RDF world is already encoded in the dataclass.</li>
 | |
| </ul>
 | |
| <p>Cons:</p>
 | |
| <ul>
 | |
| <li>Rigidity. Adding new types of annotations requires modifying the models, in the main module.</li>
 | |
| <li>Polymorphism.</li>
 | |
| </ul>
 | |
| <h2 id="a-hybrid-approach">A hybrid approach
 | |
| </h2><p>Whichever solution is chosen in the end, it needs to:</p>
 | |
| <ul>
 | |
| <li>Make it easy and error-proof to add the most common types of annotations</li>
 | |
| <li>Allow for additional annotations/attributes to be added</li>
 | |
| <li>Allow for upgrades in the future, i.e., converting the most common custom annotations into built-in ones</li>
 | |
| <li>Allow for deserialization of custom types</li>
 | |
| <li>Allow multiple consumers to add their own annotations</li>
 | |
| </ul></description></item><item><title>uv - One rust tool to rule all pythons</title><link>https://balkian.com/p/uv-one-rust-tool-to-rule-all-pythons/</link><pubDate>Mon, 17 Feb 2025 23:02:47 +0100</pubDate><guid>https://balkian.com/p/uv-one-rust-tool-to-rule-all-pythons/</guid><description><img src="https://balkian.com/img/uv.png" alt="Featured image of post uv - One rust tool to rule all pythons" /><p>Long story short: I&rsquo;m now using <a class="link" href="https://github.com/astral-sh/uv" target="_blank" rel="noopener"
 | |
| >uv</a>, and so should you.
 | |
| It is a great replacement for pip, pip-tools, pipx, poetry, pyenv, twine, virtualenv, and more.</p>
 | |
| <h2 id="context">Context
 | |
| </h2><p>For years, my strategy to manage python projects has been a mix of a custom <code>setup.py</code>, several hand-crafted <code>requirements.txt</code> files (through <code>pip freeze</code>), a custom virtualenv per project, and multiple tools to upload to PyPI.
 | |
| Although this works, this setup has many drawbacks:</p>
 | |
| <ul>
 | |
| <li>It requires user intervention (creating a venv, sourcing it, handling new deps). This isn&rsquo;t ideal if you want new (probably inexperienced) users to use your projects.</li>
 | |
| <li>On a similar note, the whole process needs to be well documented if you want other users to contribute or maintain the code.</li>
 | |
| <li>Pinning dependency versions is finicky, and I&rsquo;ve run into problems because of that.</li>
 | |
| <li>Creating a new project involves a template, or copying files from an older project.</li>
 | |
| </ul>
 | |
| <p>Of course, this is nothing new.
 | |
| There is a whole site dedicated to <a class="link" href="https://packaging.python.org/en/latest/" target="_blank" rel="noopener"
 | |
| >packaging your Python project</a>.
 | |
| A plethora of different projects have come and gone, with varying degrees of success.</p>
 | |
| <h2 id="alternatives-poetry">Alternatives (poetry)
 | |
| </h2><p>About a year before trying <code>uv</code>, I tried to catch up with the ecosystem and get to know the <code>blessed new way</code>.
 | |
| However, the task proved to be a little more difficult than expected, as the landscape is filled with a myriad of alternatives, each with their own set of drawbacks and detractors.
 | |
| Packaging has historically been a weak spot, in ironic contradiction to the Zen of Python&rsquo;s &ldquo;There should be one&ndash; and preferably only one &ndash;obvious way to do it&rdquo;.</p>
 | |
| <p>I eventually settled on <a class="link" href="https://python-poetry.org/" target="_blank" rel="noopener"
 | |
| >poetry</a>.
 | |
| Mostly because it seemed like the most popular alternative.</p>
 | |
| <p>There are many things I liked about it.
 | |
| First of all, having a convention for dependencies (<code>pyproject.toml</code>) and a tool that properly handles them was nice.
 | |
| It also removed the need to remember specific incantations to build and publish my Python projects.
 | |
| Lastly, I combined it with <code>poetry2nix</code> to create reproducible Python environments using Nix.
 | |
| This makes for a very powerful experience.</p>
 | |
| <p>However, there were multiple hiccups.
 | |
| First of all, it took me some time to figure out which specific fields to use (each tool can define ad-hoc properties in the <code>pyproject.toml</code> file), and some of them seemed redundant with the more generic ones.
 | |
| Full disclosure, this specific point might be a mistake on my side, and I do not remember the details.
 | |
| The second one is speed.
 | |
| (Re-)creating an environment took a non-negligible amount of time.</p>
 | |
| <h2 id="enter-light-uv">Enter <del>light</del> <code>uv</code>
 | |
| </h2><p>According to its repository, <code>uv</code> can replace pip, pip-tools, pipx, poetry, pyenv, twine, virtualenv, and more.
 | |
| Not only that, but it also claims to do that 10-100 times faster than pip.
 | |
| I must admit that it being written in Rust was another selling point for me, as I&rsquo;m looking for excuses to contribute to a decently-sized Rust project.</p>
 | |
| <p>Installing it is dead simple: simply download the binary (e.g., with curl) or run <code>pip install uv</code>.
 | |
| You won&rsquo;t need much more: <code>uv</code> seems to just do the right thing out of the box.
 | |
| And it does it really, really fast.
 | |
| The rest of the time it gets out of the way.</p>
 | |
| <p>My only gripe so far is that I don&rsquo;t seem to find a built-in command to drop into a shell, but that is nothing that <code>uv run $SHELL</code> cannot fix.</p>
 | |
| <h2 id="common-operations">Common operations
 | |
| </h2><h3 id="initialize-a-repository">Initialize a repository
 | |
| </h3><div class="highlight"><div class="chroma">
 | |
| <table class="lntable"><tr><td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code><span class="lnt">1
 | |
| </span></code></pre></td>
 | |
| <td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">uv init
 | |
| </span></span></code></pre></td></tr></table>
 | |
| </div>
 | |
| </div><h3 id="adding-dependencies">Adding dependencies
 | |
| </h3><div class="highlight"><div class="chroma">
 | |
| <table class="lntable"><tr><td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code><span class="lnt">1
 | |
| </span></code></pre></td>
 | |
| <td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">uv add senpy
 | |
| </span></span></code></pre></td></tr></table>
 | |
| </div>
 | |
| </div><h3 id="running-commands-inside-the-environment">Running commands inside the environment
 | |
| </h3><div class="highlight"><div class="chroma">
 | |
| <table class="lntable"><tr><td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code><span class="lnt">1
 | |
| </span><span class="lnt">2
 | |
| </span><span class="lnt">3
 | |
| </span><span class="lnt">4
 | |
| </span></code></pre></td>
 | |
| <td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">uv run &lt;COMMAND&gt;
 | |
| </span></span><span class="line"><span class="cl">
 | |
| </span></span><span class="line"><span class="cl"># e.g., run a shell using your python version and dependencies
 | |
| </span></span><span class="line"><span class="cl">uv run $SHELL
 | |
| </span></span></code></pre></td></tr></table>
 | |
| </div>
 | |
| </div><h3 id="dependency-tree">Dependency tree
 | |
| </h3><div class="highlight"><div class="chroma">
 | |
| <table class="lntable"><tr><td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code><span class="lnt"> 1
 | |
| </span><span class="lnt"> 2
 | |
| </span><span class="lnt"> 3
 | |
| </span><span class="lnt"> 4
 | |
| </span><span class="lnt"> 5
 | |
| </span><span class="lnt"> 6
 | |
| </span><span class="lnt"> 7
 | |
| </span><span class="lnt"> 8
 | |
| </span><span class="lnt"> 9
 | |
| </span><span class="lnt">10
 | |
| </span><span class="lnt">11
 | |
| </span><span class="lnt">12
 | |
| </span><span class="lnt">13
 | |
| </span><span class="lnt">14
 | |
| </span><span class="lnt">15
 | |
| </span><span class="lnt">16
 | |
| </span><span class="lnt">17
 | |
| </span><span class="lnt">18
 | |
| </span><span class="lnt">19
 | |
| </span></code></pre></td>
 | |
| <td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">uv tree
 | |
| </span></span><span class="line"><span class="cl">Resolved 44 packages in 1ms
 | |
| </span></span><span class="line"><span class="cl">my-project v0.1.0
 | |
| </span></span><span class="line"><span class="cl">├── fastapi[standard] v0.115.8
 | |
| </span></span><span class="line"><span class="cl">│ ├── pydantic v2.10.6
 | |
| </span></span><span class="line"><span class="cl">│ │ ├── annotated-types v0.7.0
 | |
| </span></span><span class="line"><span class="cl">│ │ ├── pydantic-core v2.27.2
 | |
| </span></span><span class="line"><span class="cl">│ │ │ └── typing-extensions v4.12.2
 | |
| </span></span><span class="line"><span class="cl">│ │ └── typing-extensions v4.12.2
 | |
| </span></span><span class="line"><span class="cl">│ ├── starlette v0.45.3
 | |
| </span></span><span class="line"><span class="cl">│ │ └── anyio v4.8.0
 | |
| </span></span><span class="line"><span class="cl">│ │ ├── exceptiongroup v1.2.2
 | |
| </span></span><span class="line"><span class="cl">│ │ ├── idna v3.10
 | |
| </span></span><span class="line"><span class="cl">│ │ ├── sniffio v1.3.1
 | |
| </span></span><span class="line"><span class="cl">│ │ └── typing-extensions v4.12.2
 | |
| </span></span><span class="line"><span class="cl">│ ├── typing-extensions v4.12.2
 | |
| </span></span><span class="line"><span class="cl">│ ├── email-validator v2.2.0 (extra: standard)
 | |
| </span></span><span class="line"><span class="cl">│ │ ├── dnspython v2.7.0
 | |
| </span></span><span class="line"><span class="cl">...
 | |
| </span></span></code></pre></td></tr></table>
 | |
| </div>
 | |
| </div></description></item><item><title>Python</title><link>https://balkian.com/cheatsheet/python/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://balkian.com/cheatsheet/python/</guid><description><img src="https://balkian.com/img/python.png" alt="Featured image of post Python" /><h2 id="interesting-libraries">Interesting libraries
 | |
| </h2><h3 id="tqdm"><a class="link" href="https://github.com/tqdm/tqdm" target="_blank" rel="noopener"
 | |
| >TQDM</a>
 | |
| </h3><p>From tqdm&rsquo;s github repository:</p>
 | |
| <blockquote>
 | |
| <p>tqdm means &ldquo;progress&rdquo; in Arabic (taqadum, تقدّم) and is an abbreviation for &ldquo;I love you so much&rdquo; in Spanish (te quiero demasiado).</p></blockquote>
 | |
| <p><img src="https://raw.githubusercontent.com/tqdm/tqdm/master/images/tqdm.gif"
 | |
| loading="lazy"
 | |
| alt="TQDM in action"
 | |
| ></p>
 | |
| <h2 id="tools">Tools
 | |
| </h2><h3 id="uv"><a class="link" href="https://github.com/astral-sh/uv" target="_blank" rel="noopener"
 | |
| >uv</a>
 | |
| </h3><p>🚀 A single tool to replace pip, pip-tools, pipx, poetry, pyenv, twine, virtualenv, and more.
 | |
| ⚡️ 10-100x faster than pip.</p>
 | |
| <ul>
 | |
| <li>Provides comprehensive project management, with a universal lockfile.</li>
 | |
| <li>Runs scripts, with support for inline dependency metadata.</li>
 | |
| <li>Installs and manages Python versions.</li>
 | |
| <li>Runs and installs tools published as Python packages.</li>
 | |
| <li>Includes a pip-compatible interface for a performance boost with a familiar CLI.</li>
 | |
| <li>Supports Cargo-style workspaces for scalable projects.</li>
 | |
| <li>Disk-space efficient, with a global cache for dependency deduplication.</li>
 | |
| <li>Installable without Rust or Python via curl or pip.</li>
 | |
| <li>Supports macOS, Linux, and Windows.</li>
 | |
| </ul>
 | |
| <h3 id="pipdeptree"><a class="link" href="https://pypi.org/project/pipdeptree/" target="_blank" rel="noopener"
 | |
| >pipdeptree</a>
 | |
| </h3><p>A tool to generate a dependency tree from a virtualenv.</p>
 | |
| <p>Useful to generate a clean <code>requirements.txt</code> or to clean up one that was generated with <code>pip freeze</code>.
 | |
| Usage:</p>
 | |
| <div class="highlight"><div class="chroma">
 | |
| <table class="lntable"><tr><td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code><span class="lnt">1
 | |
| </span><span class="lnt">2
 | |
| </span><span class="lnt">3
 | |
| </span><span class="lnt">4
 | |
| </span><span class="lnt">5
 | |
| </span><span class="lnt">6
 | |
| </span><span class="lnt">7
 | |
| </span></code></pre></td>
 | |
| <td class="lntd">
 | |
| <pre tabindex="0" class="chroma"><code class="language-gdscript3" data-lang="gdscript3"><span class="line"><span class="cl"><span class="o">$</span> <span class="n">pipdeptree</span> <span class="o">--</span><span class="n">exclude</span> <span class="n">pip</span><span class="p">,</span><span class="n">pipdeptree</span><span class="p">,</span><span class="n">setuptools</span><span class="p">,</span><span class="n">wheel</span> <span class="o">--</span><span class="n">warn</span> <span class="n">silence</span> <span class="o">|</span> <span class="n">grep</span> <span class="o">-</span><span class="n">E</span> <span class="s1">&#39;^\w+&#39;</span> <span class="o">|</span> <span class="n">tee</span> <span class="n">requirements</span><span class="o">-</span><span class="n">clean</span><span class="o">.</span><span class="n">txt</span>
 | |
| </span></span><span class="line"><span class="cl"><span class="n">Flask</span><span class="o">==</span><span class="mf">0.10</span><span class="o">.</span><span class="mi">1</span>
 | |
| </span></span><span class="line"><span class="cl"><span class="n">gnureadline</span><span class="o">==</span><span class="mf">8.0</span><span class="o">.</span><span class="mi">0</span>
 | |
| </span></span><span class="line"><span class="cl"><span class="n">Lookupy</span><span class="o">==</span><span class="mf">0.1</span>
 | |
| </span></span><span class="line"><span class="cl"><span class="n">pipdeptree</span><span class="o">==</span><span class="mf">2.0</span><span class="o">.</span><span class="mi">0</span><span class="n">b1</span>
 | |
| </span></span><span class="line"><span class="cl"><span class="n">setuptools</span><span class="o">==</span><span class="mf">47.1</span><span class="o">.</span><span class="mi">1</span>
 | |
| </span></span><span class="line"><span class="cl"><span class="n">wheel</span><span class="o">==</span><span class="mf">0.34</span><span class="o">.</span><span class="mi">2</span>
 | |
| </span></span></code></pre></td></tr></table>
 | |
| </div>
 | |
| </div></description></item></channel></rss> |