Note: this list might change
A crucial element of architecting a software application for scale is the collaboration of domain experts and developers. For that to happen, the application must separate the domain layer, where elements that represent the real world reside, from the infrastructure layer, where these elements are translated into precise software processes.
Within the Fintech team at Kiwi.com, we are rearchitecting a critical service to accept more payment providers. As part of this refactor, we are adopting the Unit of Work pattern to disentangle domain entities from the database processes that represent them. This way, domain experts can share their knowledge with developers more easily, and developers can find opportunities for optimization without the involvement of domain experts in the process.
Attendees will gain a solid understanding of how to implement the UoW pattern in their Python applications, how it fits into the broader context of DDD, and how to prepare their code for future growth.
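As a rough illustration of the pattern described above, here is a minimal in-memory sketch. It is not the Kiwi.com implementation: `Payment`, `InMemoryUnitOfWork` and the dictionary used as "storage" are hypothetical stand-ins, and a real unit of work would typically wrap a database transaction.

```python
import copy
from dataclasses import dataclass


@dataclass
class Payment:
    """Domain entity: it knows its business rules, not how it is stored."""
    payment_id: str
    amount: int
    status: str = "pending"

    def capture(self) -> None:
        if self.status != "pending":
            raise ValueError("only pending payments can be captured")
        self.status = "captured"


class InMemoryUnitOfWork:
    """Collects changes and applies them to storage only on commit()."""

    def __init__(self, storage: dict):
        self._storage = storage   # hypothetical stand-in for a database
        self._staged = {}

    def __enter__(self):
        self._staged = {}
        return self

    def __exit__(self, exc_type, exc, tb):
        # Anything that was not committed explicitly is discarded.
        self.rollback()
        return False

    def add(self, payment: Payment) -> None:
        self._staged[payment.payment_id] = payment

    def get(self, payment_id: str) -> Payment:
        if payment_id not in self._staged:
            # Work on a copy so uncommitted changes never touch storage.
            self._staged[payment_id] = copy.copy(self._storage[payment_id])
        return self._staged[payment_id]

    def commit(self) -> None:
        self._storage.update(self._staged)
        self._staged = {}

    def rollback(self) -> None:
        self._staged = {}


storage = {}
with InMemoryUnitOfWork(storage) as uow:
    uow.add(Payment("p-1", 1000))
    uow.commit()

with InMemoryUnitOfWork(storage) as uow:
    uow.get("p-1").capture()      # no commit: the change is rolled back
status_after_rollback = storage["p-1"].status

with InMemoryUnitOfWork(storage) as uow:
    uow.get("p-1").capture()
    uow.commit()
status_after_commit = storage["p-1"].status
```

The domain code (`Payment.capture`) never mentions persistence, which is the separation the abstract argues for.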
What if you don't want a Cat to be an Animal? What is the Liskov Substitution Principle? And what on earth is contravariance?
Discover the answers to these questions and more, as we explore the foundations of generic types in Python. And by the end, you might even understand the weirder errors that Mypy sometimes throws your way.
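To make the questions above concrete, here is a small sketch of covariance, invariance and contravariance using only the standard `typing` module. The classes and function names are illustrative, not taken from the talk.

```python
from typing import Callable, List, Sequence


class Animal:
    def speak(self) -> str:
        return "..."


class Cat(Animal):
    def speak(self) -> str:
        return "meow"


def describe_all(animals: Sequence[Animal]) -> List[str]:
    # Sequence is read-only, hence covariant: a Sequence[Cat] is accepted.
    return [animal.speak() for animal in animals]


def add_animal(animals: List[Animal]) -> None:
    # List is mutable, hence invariant: a type checker rejects List[Cat]
    # here, because this function may append something that is not a Cat.
    animals.append(Animal())


def call_with_cat(handler: Callable[[Cat], str]) -> str:
    # Callable is contravariant in its arguments: a handler that accepts
    # any Animal is also a valid handler for a Cat.
    return handler(Cat())


def handle_animal(animal: Animal) -> str:
    return "handled " + animal.speak()


cats: List[Cat] = [Cat(), Cat()]
sounds = describe_all(cats)             # accepted by a type checker
result = call_with_cat(handle_animal)   # accepted by a type checker
# add_animal(cats)  # a type checker such as Mypy would flag this call
```

All three calls would run fine at runtime; the difference only shows up when a static type checker inspects the code.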
This talk introduces the concept of vector data cubes - multi-dimensional arrays where at least one dimension is composed of vector geometries - and its implementation in Python within a new library Xvec, built on top of Xarray, Shapely 2.0 and GeoPandas.
Come join this session to check out how Visual Studio Code along with GitHub, Codespaces, and Copilot can significantly improve the data science workflow and take your productivity to the next level. In this talk we will walk through several common Python data science scenarios, showcasing all the productive tooling VS Code has to offer along the way. As a sneak peek, we will be demoing a best-in-class Jupyter Notebooks experience with VS Code Notebooks, a revolutionary new data cleaning / preparation experience with Data Wrangler in VS Code, Copilot that helps you write code and fix issues faster, and more!
This tutorial provides a hands-on introduction to LocalStack - the leading platform to develop and test cloud applications entirely on your local machine!
LocalStack provides a set of 70+ AWS services, running in a local Docker container. The hugely popular open source project (46k+ GitHub stars, 130+ million downloads) is today considered a “must-have” in the toolbox of every AWS cloud developer around the globe.
Outline: (1) Intro to AWS cloud development with Python (2) Developing Python cloud apps with LocalStack (3) Advanced integrations for IaC and CI/CD pipelines (4) Python internals & advanced features in LocalStack (5) Summary and wrap-up
This interactive session covers live coding to showcase common use cases, settings for local debugging of Lambdas and containerized apps, as well as advanced features that can radically improve team collaboration. We'll also glance over the large ecosystem of tools & integrations - including Terraform, Pulumi, CDK, Serverless.
In this tutorial, you will learn about the various Python modules for processing geospatial data, including GDAL, Rasterio, Pyproj, Shapely, Folium, Fiona, OSMnx, Libpysal, Geopandas, Pydeck, Whitebox, ESDA, and Leaflet. You will gain hands-on experience working with real-world geospatial data and learn how to perform tasks such as reading and writing spatial data, reprojecting data, performing spatial analyses, and creating interactive maps. This tutorial is suitable for beginners as well as intermediate Python users who want to expand their knowledge in the field of geospatial data processing.
In a hands-on workshop I'll introduce some of the tools and methods I have developed to improve documentation consistently and effectively, at scale - by a thousand people or more, working on a hundred or so software products and other projects.
Event-Driven Architectures (EDAs) target a real need in today's application landscape, as systems grow more complex or need to scale organically.
The talk will introduce the architecture and provide insights into different components which can be managed, connected and implemented with Python.
This talk discusses asyncio, an essential tool for asynchronous, non-blocking code in Python, and its limitations, such as non-durability and the inability to distribute work across multiple machines. Temporal.io, an open-source microservice orchestration platform, is introduced as a robust solution that uses event sourcing for durability, scalability, and resilience, effectively managing system failures. A Temporal-based asyncio event loop implementation adds seamless durability to Python code. The complete state of the program, including local variables and await calls, is fully preserved across process and other infrastructure failures. We highlight real-world applications and conclude by emphasizing how Temporal transforms the design and implementation of distributed fault-tolerant systems.
Must-have tools for running GraphQL in production
Gear up for a groundbreaking transformation of your GraphQL prowess! Join us for an engaging and informative session as we unveil a set of indispensable tools and practices that will take your GraphQL APIs to new heights in production environments.
Make your coding meaningful to non-technical recipients! Write your code in Jupyter Notebook, add widgets with the Mercury framework, and easily turn your notebook into an interactive web app. Or create a dashboard or a report, and deploy it.
What's this thing called MLOps? You may have heard about it by now, but never really understood what all the fuss is about. Let's find out together!
In this tutorial, you will learn about MLOps and take your first steps in a hands-on way. To do so, we will be using Open Source tooling. We will take a simple example of a Machine Learning use case and gradually make it more ready for production 🚀.
We start with a simple time-series model in Python using scikit-learn and first add logging steps to make the performance of the model measurable. Don't worry: we will go through it step-by-step, so you won't be overwhelmed. Then, we will log our ML model and load it back into an inference step. Lastly, we will learn about deploying these actual models by Dockerizing our application 🙏.
What-if analysis is the key to exploring datasets and assessing outcomes by gradually varying input parameters. It is a vital tool for users in the realm of data analysis and decision-making. However, implementing what-if analysis can be challenging. Join this captivating talk as we delve into the practical implementation of what-if analysis using Taipy.
The code.kiwi.com community has been running Python Weekend, an educational community project, since 2016. Over the past 7 years we have helped hundreds of Python developers complete the program, accelerating their careers, traveled to 10+ cities all over Europe and collaborated with numerous local Python communities to make it happen.
At first glance, Python Weekend is a 2.5-day supervised coding event for Junior+ Python devs, where participants build the prototype of core Kiwi.com technology, while getting support from a group of experienced engineers, for free. But there’s so much more to this.
At this poster session, we will share how to run an educational Python project so that the community, the business and the local tech scene all benefit. We will show how to shape a culture that connects real business challenges with junior talent. We will see how dev edu projects can impact the culture of mentorship, and present feedback from mentees all over the world about their experience.
Proteins are essential components of our bodies, with their function often dependent on their 3D structure. However, uncovering the 3D structure has long required months of hard work in the lab. Recent advances in machine learning and natural language processing have made it possible to build models (e.g. AlphaFold) capable of predicting a protein's 3D structure with the same precision as experimental methods.
In this talk, I will explore an even more specific application of language models for proteins: the detection of a knot in a protein's 3D structure solely from the protein's amino acid sequence. Knotting in proteins is a phenomenon that can affect their function and stability. Thanks to NLP and interpretation techniques, we can try to uncover why and how proteins tie themselves into a knot. In this research, we rely on many Python-based tools, from Biopython to PyMOL and the Hugging Face Transformers library.
eBPF is an amazing technology that can run sandboxed programs in a privileged context such as the operating system kernel. But are eBPF programs limited to the operating system kernel? eBPF programs have fast access to resources like memory. These programs can access the memory of running Python applications very quickly, allowing you to instrument Python processes with low overhead!
In my presentation, I will show how Python's internal structure supports instrumentation through the use of eBPF. Following that, we'll experiment with eBPF and other modern techniques for instrumenting Python applications. I'll explain why eBPF is a more appropriate and efficient technology for instrumentation. By the end of the session, we will have developed a simple eBPF-based tracing tool for instrumenting Python applications.
After this presentation, you will better understand how eBPF can help you in the instrumentation of Python applications.
Descriptors are not black magic and this practical tutorial will show you that. In fact, you use descriptors every day and you don't even know it!
Through a series of practical, hands-on exercises, you will learn
- how to create a descriptor;
- how to use the dunder methods;
- what the descriptor protocol is; and
- where descriptors show up in day-to-day Python code.
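As a taste of what such an exercise might look like, here is a minimal sketch of a data descriptor built on `__set_name__`, `__get__` and `__set__`. The `Positive` and `Product` classes are invented for illustration and are not taken from the tutorial material.

```python
class Positive:
    """Data descriptor that only accepts positive numbers."""

    def __set_name__(self, owner, name):
        # Called once, when the owning class is created.
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self          # accessed on the class, not an instance
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.private_name[1:]} must be positive")
        setattr(instance, self.private_name, value)


class Product:
    price = Positive()

    def __init__(self, price):
        self.price = price       # goes through Positive.__set__

    def total(self, quantity):
        return self.price * quantity


item = Product(10)

try:
    Product(-1)
except ValueError as exc:
    error = str(exc)

# Plain functions are descriptors too: this is how methods get bound.
bound_total = Product.__dict__["total"].__get__(item, Product)
total = bound_total(3)
```

The last two lines hint at why descriptors show up in everyday code: every method call relies on the function's own `__get__`.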
Python running in the browser is the new frontier for creating true client-side web and mobile applications. Today we can do many incredible things that were not possible just a few months ago, before WASM, Pyodide and PyScript.
The talk will cover what's possible today, cover the major features offered by PyScript and walk through creating amazing applications and games with Python, on the browser, without the need for Python server-side logic.
Join our hands-on workshop and discover how to build fast, production-ready APIs using Robyn, a developer-friendly web framework for Python. We'll guide you through key features like GraphQL, WebSockets, and data validation, as well as essential topics like app structure, database modeling, and code splitting. With our workshop, you'll gain practical experience and valuable insights into Robyn's simple and extensible API, middleware, and deployment process.
This workshop will teach you how to create your first GraphQL API using Python and Strawberry. We will be using FastAPI as our framework of choice, but most of the concepts will be applicable to other frameworks too.
We'll learn how GraphQL works under the hood, and how we can leverage type hints to create end-to-end type-safe GraphQL queries.
To run this workshop you need to have at least Python 3.8 and have followed the installation guide here: https://github.com/patrick91/strawberry-tiny-workshop
In this talk, we will discuss how to detect chains of fraudulent transactions and help investigation agencies fight money laundering by providing useful insights, with the help of the Python programming language and its packages.
Whisper AI, a model from OpenAI, has been largely overlooked despite its impressive ability to accurately transcribe and translate human speech from audio.
In this talk I will explore the architecture of the model and explain why it works so well. Additionally, I will live demo the model's capabilities in three languages, showing how you can use it on your own computer to generate English subtitles for a wide range of content.
Don’t you love pickles? In the data science space, the pickle module has become one of the most popular ways to serialise and distribute machine learning models - yet, pickles introduce a wide range of problems. For starters, it is incredibly easy to poison a pickle. Once this happens, a poisoned pickle can be used by an attacker to inject any arbitrary code into your ML pipelines. And what’s even worse: it’s incredibly hard to detect if a pickle has been poisoned!
Good news? Help is on the way! You now have access to an increasing number of tools to help you generate higher-quality pickles. And when those are not enough, you can always draw inspiration from the DevOps movement and their trust-or-discard processes.
This talk will show you how widespread pickles are and how easy it is to poison models serialised with pickle, but also how easy it is to start protecting them from attacks.
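The sketch below shows why loading an untrusted pickle is dangerous, and one possible mitigation. The payload is deliberately harmless (it only evaluates `6 * 7`), and the allow-list unpickler follows the `find_class` approach described in the standard library documentation; it is an illustration, not necessarily one of the tools presented in the talk. `Poisoned`, `AllowListUnpickler` and `restricted_loads` are hypothetical names.

```python
import io
import pickle


class Poisoned:
    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" and runs it at load time.
        # A real attacker would put a far less innocent call here.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Poisoned())
result = pickle.loads(payload)   # executes the embedded call


class AllowListUnpickler(pickle.Unpickler):
    """Refuses every global that is not explicitly allowed."""

    ALLOWED = {("builtins", "set"), ("builtins", "frozenset")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    return AllowListUnpickler(io.BytesIO(data)).load()


try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# Plain data still round-trips through the restricted loader.
plain = restricted_loads(pickle.dumps([1, 2, 3]))
```

An allow-list like this narrows the attack surface, but it does not make arbitrary pickles safe; it only makes sense alongside checks on where a file came from.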
As Bloomberg’s infrastructure grows and evolves, the tools we use to manage it are becoming increasingly important. To streamline infrastructure management, our team set out to design a REST API and constituent CLI (Command Line Interface) that would comprise a single interface for both programmatic and human interaction with our infrastructure. Traditionally, building a CLI that is tightly coupled to an API requires maintaining a separate codebase, which is tedious and error-prone. Instead, we designed a CLI that dynamically generates commands based on the OpenAPI JSON documentation. However, since APIs are designed for computer interaction, we designed our API to include the information needed to implement a human-friendly CLI. Leveraging Python, FastAPI, and numerous other open source projects, we built a stable, extensible tool that greatly improves how we interact with our infrastructure.
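To illustrate the idea of deriving CLI commands from an OpenAPI document, here is a minimal sketch using only `argparse`. It is not Bloomberg's tool: the `SPEC` dictionary is a hypothetical, heavily trimmed OpenAPI fragment, and `build_parser` and the `infra` program name are assumptions made for the example.

```python
import argparse

# Hypothetical, heavily trimmed OpenAPI document.
SPEC = {
    "paths": {
        "/hosts": {
            "get": {
                "operationId": "list-hosts",
                "summary": "List hosts",
                "parameters": [
                    {"name": "datacenter", "in": "query", "required": False,
                     "description": "Filter by datacenter"},
                ],
            }
        },
        "/hosts/{name}": {
            "delete": {
                "operationId": "delete-host",
                "summary": "Delete a host",
                "parameters": [
                    {"name": "name", "in": "path", "required": True,
                     "description": "Host name"},
                ],
            }
        },
    }
}


def build_parser(spec: dict) -> argparse.ArgumentParser:
    """Create one sub-command per API operation found in the spec."""
    parser = argparse.ArgumentParser(prog="infra")
    subcommands = parser.add_subparsers(dest="command", required=True)
    for path, operations in spec["paths"].items():
        for http_method, operation in operations.items():
            sub = subcommands.add_parser(
                operation["operationId"], help=operation.get("summary")
            )
            # Remember which request this sub-command maps to.
            sub.set_defaults(http_method=http_method.upper(), path=path)
            for param in operation.get("parameters", []):
                sub.add_argument(
                    "--" + param["name"],
                    required=param.get("required", False),
                    help=param.get("description"),
                )
    return parser


parser = build_parser(SPEC)
args = parser.parse_args(["delete-host", "--name", "web-01"])
request_line = f"{args.http_method} {args.path.format(name=args.name)}"
```

Because the commands come from the document rather than from hand-written code, a new API endpoint would show up in the CLI without a separate change, which is the maintenance benefit the abstract describes.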
Dirty Equals is a new Python library by Samuel Colvin, the creator of Pydantic. It will transform how you write tests, especially for APIs.
I made some contributions to it, which forever changed how I thought about NotImplemented. I thought it was a placeholder for unfinished work and unexpected use cases. I thought the language quirks it created in equality comparison were annoying.
But in Dirty Equals, it’s a magic way to transform Python’s built-in equality operator... And that changed how I think about language quirks, full stop.
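The mechanism behind this can be sketched in a few lines. When the left operand's `__eq__` returns `NotImplemented`, Python gives the right operand a chance to answer. The `IsPositiveInt` class below is a hypothetical helper written for this example; it is not the library's own code.

```python
class IsPositiveInt:
    """Compares equal to any positive integer (illustrative helper)."""

    def __eq__(self, other):
        if isinstance(other, int) and not isinstance(other, bool):
            return other > 0
        # "I don't know how to compare with this": let Python ask the
        # other operand, and fall back to identity if it does not know.
        return NotImplemented

    def __repr__(self):
        return "IsPositiveInt()"


# int.__eq__ does not know IsPositiveInt and returns NotImplemented,
# so Python tries the reflected comparison IsPositiveInt().__eq__(42).
int_matches = 42 == IsPositiveInt()

# The same fallback applies inside containers, which is handy in tests.
response = {"id": 42, "name": "Ada"}
dict_matches = response == {"id": IsPositiveInt(), "name": "Ada"}

# Both sides return NotImplemented here, so the result is plain False.
str_matches = IsPositiveInt() == "hello"
```

This is why an assertion against an API response can mix exact values with "any value shaped like this" placeholders.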
Apache Airflow is an Open Source workflow orchestrator. It is a Python library that allows you to automate complex code and integrate it with a plethora of Data Sources. It is provided with an integrated UI and API for both your human and programmatic needs.
After 5 years of running Airflow in production, I hope to share some insights on the technology: its strengths and weaknesses, the features I recommend and the more dangerous ones, and similar considerations for the UI.
I'll also be talking about how you can make your own Operators in Airflow.
Come take a deeper dive into the same solution used by Airbnb, Slack, Walmart and many more to efficiently run their data pipelines.
From recommendation systems to LLM-based applications, vector search is a critical component of the modern AI workflow. Existing vector solutions are complicated to use, hard to maintain, and cost too much. LanceDB is a free open-source vector store that can perform low latency vector search on billion-scale vector datasets on a single node.
Most introductory Python books and online resources like w3schools.com try to be complete when a new concept is explained. This does not always work well for beginners. E.g. if you have just grasped how a while-loop works, it may cause too much cognitive load to also understand the break and continue options, let alone the else clause. The learning psychologist Jerome Bruner introduced the term "spiral learning". The idea is that you don't teach all aspects of a new concept, but just enough to use it. At a later stage a teacher can revisit the subject and explain more details, when a student needs this to take the next step. Spiral Python is a road map of subjects that can be found in any introductory book or online resource about Python, but absolutely original in the sense that it takes into account how people learn in a natural way. You do not need to know the whole language before you can use it. Spiral Python also contains exercises (to practice) and challenges (to motivate).
How can large legal document repositories be brought into the public domain without releasing private data? The fundamental concepts behind document anonymization are entity recognition, masking type, and pseudonymization. Using Python and a collection of libraries such as spaCy, PyTorch, and others, we can achieve good anonymization scores. How is this applied within a flow containing AI models for NER? And once documents are anonymized, how can the result be improved with more text mining, using Python-based apps and a human in the loop? Although it was approved in 2016, applying the GDPR at the European level remains a challenge in banking, legal, and other contexts. This talk covers the process of transforming PDF and DOCX documents into XML, processing them using regular expressions and spaCy/PyTorch models, and parsing the results using AntConc and Textacy. All the ideas will be supported by real experience from the MAPA project, a European anonymization project that finished in 2022.
How can you scale Python to run at petabyte scale, with the reliability needed to trade billions of dollars? With ArcticDB we have been doing exactly that for the last four years, by leveraging interoperability between Python and high-performance C++, with a detailed understanding of the data structures inside Python and a few extra tricks up our sleeves.
Come take a peek under Python's bonnet and learn how to hotwire a few things along the way.
The logging module is a really powerful tool for troubleshooting, with a lot of potential to save us hours of debugging.
The aim of the talk is to provide an overview of how the logging module in Python works, how Django uses it, and how to improve our logging to make it better for our web project.
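For orientation, here is a minimal sketch of a dictionary-based logging configuration. Django's `LOGGING` setting uses this same `logging.config.dictConfig` format; the logger name `shop` and the format string are assumptions made for the example.

```python
import logging
import logging.config

# Same dictionary shape that Django expects in its LOGGING setting.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {"format": "%(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "simple"},
    },
    "loggers": {
        # Hypothetical application logger; children inherit its level.
        "shop": {"handlers": ["console"], "level": "INFO", "propagate": False},
    },
}

logging.config.dictConfig(LOGGING)

logger = logging.getLogger("shop.payments")   # child of "shop"
logger.debug("dropped: below the INFO threshold")
logger.info("payment %s accepted", "p-1")     # lazy %-style formatting
```

Naming loggers after dotted module paths, as in `shop.payments`, is what lets a single entry in the configuration control a whole package.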
PEP 458 uses cryptographic signing on PyPI to protect Python packages against attackers. The implementation of the PEP inspired the Repository Service for TUF (RSTUF), a project accepted into the OpenSSF sandbox. We identified that the design could benefit other organizations and repositories looking to secure their software supply chains. In this talk we would answer the following questions:
- How did the PEP 458 design help to start the Repository Service for TUF (RSTUF)?
- How could RSTUF be used for PyPI with its millions of packages?
- How can RSTUF be deployed by any organization at any scale without requiring TUF expertise?
Additionally, in this talk, we would give an overview of PEP 458, how it works, and give a high-level overview of TUF.
Python offers us the ability to customize how it starts up. In some cases, arbitrary Python code can get executed before the first line of your module is reached. This is necessary for some of its dynamic nature, like virtualenvs, but can also be harnessed to make the interpreter experience truly personal.
All major companies have huge amounts of (mostly PDF) documents that contain important, even critically important, information that no longer exists anywhere else in their data stores.
Reports, once generated for shareholders and legal or financial authorities, may still be useful for developing long-term forecasts or triggering company management decisions.
By definition, documents are intended for human perception, and as such contain unstructured data from an information technology perspective.
Therefore, tools to extract PDF text content (mostly, but not only text) from millions of pages have become important vehicles to recreate structured information.
This presentation talks about the "need for speed" of extraction in this Big Data scenario and the need for integration with OCR capabilities, and presents an open-source toolset that combines both top-of-the-class performance and maximum extraction detail.
Python tools like Bokeh and Dash let you build custom Web-based interactive visualization apps and dashboards. While these solutions work well to visualize megabyte-sized datasets, web technologies struggle to render gigabyte or larger datasets efficiently, because they transfer all the data into the client browser. Pre-rendering the data on the server using a tool like Datashader can visualize such large datasets efficiently, but the resulting static renderings make exploring individual datapoints difficult.
This talk demonstrates how the HoloViz ecosystem of tools (holoviz.org) allows you to run exploratory notebooks and build dashboards that do server-side rendering of billions of data points without losing the ability to interactively inspect and annotate individual samples in the browser.
Testing is a crucial part of the software development process. But, with so many testing techniques available, it can be challenging to know which one to use. While unit testing is a popular technique, it's not always the most effective or efficient way to ensure software quality. In this talk, we’ll explore spec-based testing, a technique that focuses on verifying that the software behaves in accordance with its specifications or requirements.
The Internet of Things has been flourishing for many years, and Python has played an important role in making many devices easy to automate. One of the challenges for the next generation of ML is to think small. You read that right: “thinking small”. It’s time to put well-trained ML models on small devices: ML on microcontrollers. We are going to dive into TinyML and evaluate different setups for interacting with sensors on microcontrollers. We will discuss the different hardware options and frameworks to start with, and look at use cases that TinyML can address, such as agriculture, conservation, detection of health issues, and ecology monitoring. In this talk, you will learn about Tiny Machine Learning (TinyML), an approach to deploying machine learning in embedded systems so that ML can run on microcontrollers. Lastly, we will discuss real use cases and a practical case that could be implemented at home.
I will introduce the concept of data unit tests and why they are important in the workflow of data scientists when building data products. In this talk, you will learn a new tool you can use to ensure the quality of the products you build.
This talk is meant to be a hitchhiker's guide to Dungeons and Dragons (D&D) for programmers.
We will leverage our wit and intelligence to explore a very perilous dungeon 🧙, where a venomous dragon is hiding in the shadows 🐉.
Thanks to a magical potion in an ancient flask, our wizardly skills have been enhanced with Pythonic capabilities 🐍 making us the most powerful and geeky magician of the realm.
These newly acquired powers revealed unprecedented strategies (i.e. algorithms 🙃) that will guide us through the maze, avoiding all the traps and pitfalls ⚔️, and will help us maximise the power of our fire magic ☄️ to finally slay the dragon.
If you would like to know more about this new Pythonic spell, and the secrets it unveiled, or if you're simply interested in new graph algorithms that can run blazingly fast, maximising your CPU capabilities, this is the talk for you!
Is your FastAPI really fast? Did you benchmark it, or do you just have faith?
In this talk, Marcelo will give tips to improve the performance of your FastAPI application, and you’ll see how impactful those changes can be.
PyTorch is one of the most popular machine learning frameworks, and its latest iteration (PyTorch 2.0) landed just a couple of days back. Among other things, PyTorch 2.0 offers faster performance with a fully backward-compatible API that guarantees the development ergonomics that PyTorch is known for.
In this talk, we will examine how practitioners (researchers and engineers) can benefit from optimizations provided by PyTorch 2.0 and what other improvements are on the horizon.
Are you tired of struggling with memory management in Python? Do you want to take your skills to the next level and achieve maximum performance while minimising memory usage? Look no further, here is Zero-Copy in Python! Zero-copy is a technique in computer programming that allows data to be transferred between different parts of a program without being copied to intermediate buffers. In Python, this technique can be achieved using the memoryview object, which provides a view into the memory of other objects. Learn how to efficiently manipulate large datasets and optimise your code with the help of this powerful tool. Whether you're working with sockets, objects or memory profiling, memoryview is your key to faster and more efficient Python programming.
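A short sketch of the difference between slicing a buffer (which copies) and slicing a `memoryview` of it (which does not). The buffer contents are made up for the example.

```python
data = bytearray(b"header:payload-bytes")

# Slicing a bytearray creates a new, independent object.
copied = data[7:]
copied[0:1] = b"X"               # does not affect `data`

# Slicing a memoryview creates another view onto the same memory.
view = memoryview(data)
payload = view[7:]
payload[0:1] = b"P"              # writes straight through to `data`

shares_memory = payload.obj is data
payload_bytes = payload.tobytes()   # an explicit copy, made only on request
```

After this runs, `data` holds the capitalised payload while `copied` went its own way, which is the behaviour that lets large buffers be passed around and trimmed without duplicating them.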
Black, Flake8, isort, and Mypy are useful Python linters but it’s challenging to use them effectively at scale in the case of multiple codebases, in a large codebase, or with many developers. Linter analysis on large codebases is slow. Linters may slow down developers by asking them to fix trivial issues. Running linters in distributed CI jobs makes it hard to understand the overall developer experience.
In this talk, we'll walk you through solving those scaling problems using a reusable linter framework that releases new linter updates automatically, reuses consistent configurations, runs linters on only updated code to speed up runtime, collects logs and metrics to provide observability, and builds autofixes for common linter issues. Our linter runs are fast and scalable. Every week, they run 10k times on multiple millions of lines of code in over 25 codebases, generating 25k suggestions for more than 200 developers. The autofixes also save 20 hours of developer time every week.
In other words, Descriptors + PEP-362 (function signature object) and a seasoning of PEP-487 (simpler customization of class creation via
There are different ways to have generated methods and attributes attached to all classes in a library, and this talk presents the way we’re doing it in scikit-learn. Here you’ll understand the use-case, and see the details and challenges presented by it, and how we approached them.
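One way to combine these three ingredients is sketched below. It is a toy illustration and not scikit-learn's code: `RequestMethod`, `Base`, the `set_<method>_request` naming and the `_requests` attribute are all assumptions made for the example.

```python
import inspect


class RequestMethod:
    """Descriptor that builds a generated setter method on access."""

    def __init__(self, method_name, keys):
        self.method_name = method_name
        self.keys = keys

    def __get__(self, instance, owner=None):
        if instance is None:
            return self

        def setter(**kwargs):
            unknown = set(kwargs) - set(self.keys)
            if unknown:
                raise TypeError(f"unexpected arguments: {sorted(unknown)}")
            requests = instance.__dict__.setdefault("_requests", {})
            requests.setdefault(self.method_name, {}).update(kwargs)
            return instance

        setter.__name__ = f"set_{self.method_name}_request"
        # PEP 362: advertise an explicit signature instead of **kwargs.
        setter.__signature__ = inspect.Signature(
            [
                inspect.Parameter(
                    key, inspect.Parameter.KEYWORD_ONLY, default=None
                )
                for key in self.keys
            ]
        )
        return setter


class Base:
    def __init_subclass__(cls, **kwargs):
        # PEP 487: runs for every subclass, no metaclass required.
        super().__init_subclass__(**kwargs)
        for method_name in ("fit", "predict"):
            method = cls.__dict__.get(method_name)
            if method is None:
                continue
            keys = [
                param.name
                for param in inspect.signature(method).parameters.values()
                if param.kind is inspect.Parameter.KEYWORD_ONLY
            ]
            if keys:
                setattr(
                    cls,
                    f"set_{method_name}_request",
                    RequestMethod(method_name, keys),
                )


class Estimator(Base):
    def fit(self, X, *, sample_weight=None):
        return self

    def predict(self, X):
        return X


est = Estimator().set_fit_request(sample_weight=True)
signature_text = str(inspect.signature(est.set_fit_request))
```

Here only `fit` gets a generated setter, because `predict` has no keyword-only parameters; tools that read signatures see the generated parameters rather than a bare `**kwargs`.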
Pyodide is a port of CPython to WebAssembly/Emscripten enabling Python packages to run directly in the browser or Node.js. We will provide an overview of Pyodide's architecture, capabilities, and potential use cases before looking into building, running, and testing Python packages for the browser.
We will also discuss how browser-specific optimizations, such as code splitting, tree shaking, and lazy loading could be adapted to Python to reduce package size and load time.
Finally, we will mention some of the common restrictions of the browser runtime and how they can be overcome in Python packages.
At the Netherlands Forensic Institute (NFI), we've developed a Python-based deep learning model to spot life-threatening messages in lawfully intercepted communication data, like those from the infamous chat service Encrochat.
Thanks to the application of our model in collaboration with the Dutch Police, dozens of potential victims of violent crimes, including murder, serious assault, and kidnapping, have been warned and safeguarded. In this talk, we'll dive into the development, implementation, and success of our deep learning model in the fight against violent criminal activities. We'll also tackle the risks tied to using deep learning for these cases and discuss the precautions we took to ensure responsible and accurate use.
Everybody loves f-strings in Python. But what if they could be even better? Thanks to PEP 701, Python 3.12 will ship with an improved version of f-strings that will once and for all fix the little remaining problems that f-strings have had, while also supercharging them with new cool powers. In this talk, you will discover the dark little secrets of how f-strings were being processed before Python 3.12 and the many things that didn't work and you didn't know about. You will learn how we changed thousands of lines of manually written C code without anybody noticing, how we changed the oldest part of CPython so quotes behave like parentheses, and how we taught the PEG parser to understand f-strings. Plus, you'll gain an understanding of how these new and improved capabilities will provide several advantages for both end-users and library developers, while also reducing the maintenance cost of the CPython implementation.
Pendulum is a Python package for working with dates, times, and timezones. It offers a simple and intuitive API for common date/time operations and provides advanced functionality for dealing with more complex scenarios. Some of the interesting points of Pendulum include support for leap years, time zones, and daylight saving time, as well as a fluent API for creating and modifying dates and times.
One of the standout features of Pendulum is its support for time zones. The library comes with a comprehensive list of time zones, and it can automatically adjust dates and times to the local time zone of a given location. Additionally, Pendulum can handle time zone conversions with ease, making it easy to work with date/time data across different time zones.
Pendulum also provides a powerful API for creating and modifying dates and times. With its fluent interface, developers can create and manipulate dates and times using a natural, human-readable syntax.
HTTPX is a fully featured HTTP client for Python 3, which provides sync and async APIs, and support for both HTTP/1.1 and HTTP/2. It also includes a built-in command-line client.
We'll be taking a look at the architecture of the client, learning from the design decisions behind it, and gaining a better understanding of HTTP along the way.
Apache Arrow and its Python library, PyArrow, are becoming the de facto standard for transferring data and for interoperability between libraries and languages. As more compute engines, storage systems and databases start to speak Arrow, you might be relying on it without even knowing. The same transformation is happening with Substrait, which is on track to become the standard representation of query plans themselves, allowing queries to be routed to different engines as long as they speak Substrait, or even decomposed and forwarded to different engines. This talk will provide a quick introduction to the Arrow ecosystem, showing Python developers how libraries like Pandas, Polars and PyArrow itself leverage Arrow, and how compute engines like Velox, DataFusion and Acero are embracing Arrow and Substrait. The talk will also show how a basic database system based on Arrow and Substrait can be built with a minimal amount of code thanks to all the foundations they provide.
What is the European digital identity? How can you access digital public services from another EU country? Why is it so hard to create a European ecosystem of digital services? Does the EU support open source?
This (opinionated) talk will present the current state of digital services in the EU. It will summarize the normative and technical challenges, and their impact on the resulting platforms in terms of UX, cybersecurity and maintainability.
Django and Python make fullstack and API web projects a breeze. But as Python has matured, significant tooling has risen to improve the development experience (DX). Can you use this tooling, in a modern editor and IDE, to stay in the flow and make your development…joyful?
In this session we’ll put PyCharm to work, at helping us work. Navigation, refactoring, autocomplete – the usual suspects. We’ll also see “test-first” development to stay in the IDE, plus how this can apply to frontends.
Beautiful is better than ugly.
The frontier of AI Language Models awaits exploration.
We, Pythonistas, face choices on how to use these tools.
Advanced models like GPT-4, BARD, and LLaMa generate human-like responses.
The nature of Language Models is fear,
But tools like TransformerLens show The Way.
Understanding The Model is possible.
The nature of Language Models is excitement.
Using them out of the box is one option.
Prompt engineering is another.
ChatGPT plugins and LangChain offer a third choice.
Fine-tuning them presents a fourth.
Training them from scratch is the fifth option.
Not using them at all is the final option. It may be safer.
The output for one LM is the prompt for another.
While openai is an excellent library, and
LangChain composes language models and utilities.
GPT's plugin system also composes language models and utilities, and
There should be one-- and preferably only one --obvious way to do it.
- Are prices of short-term rental apartments in your region similar? How similar are they, and at which distance do they tend to be correlated?
- Do you have access to a few air pollution measurements but must provide a smooth map over the whole area?
- Is your machine learning model based on remote sensing data from Earth Observation satellites, and do you want to include data sampled on Earth?
- Do you work with county-level socio-economic factors, but you want to get insights at a finer scale?
if any(answer), then come and see what we can do with the pyinterpolate package, designed exactly for spatial interpolation!
The emergence of ChatGPT has led to an exponential growth of prospects and implementations in the field of Natural Language Processing (NLP). Various teams were struck with FOMO (Fear of Missing Out) and hastened to incorporate Large Language Models (LLMs) into their products. By using OpenAI models (text-curie-001, davinci, gpt-3.5-turbo), we successfully integrated them into our production on March 2, granting our users the ability to receive text summaries in their email reports and comprehend the essence of any article within our application. Three weeks later, we trained our own large language model for the same purpose. This talk will delve into our journey, exploring the lessons and insights gleaned from our hands-on experience with these cutting-edge tools.
Python 3.11 is considerably faster than 3.10. How did we do that? And how are we going to make 3.12 and following releases even faster?
In this talk, I will present a high-level overview of the approach we are taking to speeding up CPython. Starting with a simple overview of some basic principles, I will show how we can apply those to streamline and speed up CPython. I will try to avoid computer science and software engineering terminology, in favor of diagrams, a few simple examples, and some high-school math. Finally, I will make some estimates about how much faster the next few releases of CPython will be, and how much faster Python could go.
Apache Kafka® is the de facto standard in the data streaming world for sending messages from multiple producers to multiple consumers, in a fast, reliable and scalable manner.
Come and learn the basic concepts and how to use it, by modelling a traditional British fish and chips shop!
Behaviour-driven development promises evergreen documentation or human-readable executable specification - sounds great. However, adopting it takes much more than simply installing behave or pytest-bdd and writing Gherkin. This talk will show what.
Do you feel that digging through GitHub code to learn how to use it is painful? Do you also think that simply packaging and publishing your library to the world on PyPI sometimes isn't enough to help others use what you are working on? Then come join me, as this talk is definitely for you!
In this presentation, I'd like to introduce you to Typer, and explain why it's probably the easiest and most affordable way (in 2023) to create command line applications that your users will love to use. We'll discuss its key strengths, how to structure your CLI application, and how to make it ready to be packaged and published with no hassle.
SQLAlchemy is one of the most popular ORM libraries in Python. In this talk I will present caveats and gotchas that other Pythonistas may run into while writing an asynchronous backend application using SQLAlchemy as an ORM. We will mainly focus on how SQLAlchemy handles transactions and connections to the database, and what issues we may face because of it.
Systems built with microservices tend to become complex over time. There are several approaches that encapsulate complex distributed system layouts with an API Gateway, or backends for frontends. Having a GraphQL gateway is one of the available options. This method of delivering client-facing APIs has become the standard with modern single-page applications.
Take your Continuous Integration to the next level! Learn how to optimize your pipelines for faster and more efficient builds through parallelization, caching, failing early, conditional runs, and more.
With the risk of losing access to information, Python has been used to create means for society to continue having the right to know what government officials are doing in Brazil.
This lecture aims to show how tools built with Python have been used to combat the difficulty of accessing Brazilian government information, and how the language has been useful for those who seek to keep society informed.
Have you heard about Polars? How does it differ from Pandas? Is Polars replacing Pandas? In this talk, we are going to demystify these questions about Polars. We will compare Polars and Pandas and explain the pros and cons of both.
pip install malware: it’s that easy. Almost all projects depend on external packages, but did you know how easy it can be to install something nasty instead of the dependency you want? I'll be showing this live, as I make malware and install it from PyPI onto my own computer during the talk!
Python has proven itself to be a powerful tool for data science, and for web servers. However, one area where it hasn't historically been popular is in building applications for end users.
In this talk, you'll discover how you can use Briefcase to distribute an app to users on desktop, mobile, and the web - all from a single Python codebase.
Data validation is a critical component of any software application, ensuring that the data processed by the application is accurate and consistent. However, data validation can often be a tedious and error-prone process, especially when dealing with complex data structures. Pydantic, a powerful and flexible data validation library for Python, simplifies the process of data validation by providing a declarative syntax that is easy to read and write.
The advancements of artificial intelligence in computer vision and natural language processing often make the headlines, but the subspace of musical AI is developing just as rapidly. Let’s take a dive into the research area of music information retrieval and see how Python enables some of its proudest achievements. You’ll learn about common MIR tasks and get ideas on how you can analyze, generate and interact with music using code, so you can start exploring right away! No music theory knowledge nor prior experience with MIR is expected.
Magic happens every time you take your phone out of your pocket. Somehow, just by looking at the screen, your phone recognizes you (and only you) and magically unlocks.
Have you ever stopped for a minute and thought to yourself - How does that even work? And maybe more importantly, how secure is it?
In this session, we're going to understand how facial recognition works under the hood. We'll dive into some potential security problems, and we'll show you how we were able to break into a biometric database built on the Dlib-python-library by applying a sophisticated brute-force attack. The results will surprise you.
Open source has grown to the point where people on many different tech career paths can improve projects with their skills, and it provides jobs for those interested in working with open source. Open source contribution programs build participants' capacity to become professionals.
Active community participation helps enhance career growth.
Outreachy is a paid, remote open source internship program that develops talent and prepares interns for career growth. Outreachy provides internships to people subject to systemic bias and impacted by underrepresentation in the technical industry where they are living.
At the end of this session, beginners and intermediate-level attendees will know enough to start building a career in open source; experts will gain insight into how they can give back to the community as mentors, helping new contributors understand the open source ecosystem and how to contribute.
Music streaming services like Spotify and YouTube are famous for their recommendation systems, and each service takes a unique approach to recommending and personalizing content. While most users are happy with the recommendations provided, there is a section of users who are curious about how and why a certain track is recommended. Complex recommendation systems take into account various factors like track metadata, user metadata, and play counts, along with the track content itself.
Inspired by Andrej Karpathy's work on building your own GPT, we use language models to build our own music recommendation system.
Did you know that originally programming was a female-heavy field? How did we get to the stereotype of the antisocial programmer (and therefore male)?
How is the idea that good programmers are “born, not made” still affecting our tech industry and society?
In recent years we have seen an explosion in the use of Python in the browser: Pyodide, CPython on WASM, PyScript, etc. All of this is possible thanks to the powerful functionality of the underlying platform, WebAssembly, which is essentially a virtual CPU inside the browser.
Python’s expressive syntax, ease of use, and powerful ecosystem of third-party packages are all major contributing factors to its thriving use for accelerator controls at CERN. Providing access to this rich ecosystem in a protected environment, whilst also allowing developers to augment this with internally developed packages is a key enabling service. Existing open-source solutions didn’t meet our needs, and the evolving Package index standardisation, as well as exposure to dependency confusion attacks, left us searching for a more modular and flexible approach.
In this presentation we will demonstrate the Python package upload, index, and browsing services developed at CERN. We will discuss the gradual transition from our existing repository service (based on Nexus), and demonstrate - with the help of recent packaging PEPs - the flexibility that modularising the services has brought, helping us to meet our needs for local specialisation and enhanced security measures.
As a developer, you play a crucial role in the security of your projects. At the same time, it can be difficult to know if what you’re doing is enough. Luckily, you don’t have to be a security expert to contribute to the security of your projects. Instead, you can use industry standards as a guide for your approach to security.
In this talk, I will introduce you to a framework that is especially accessible to developers, the OWASP DevSecOps Maturity Model, and help you get started with a systematic approach to improving the security of your projects.
Many modules you use and love have a portion of their implementation written in other languages, and for that a Python extension needs to be made. Python offers a C-API that allows people to extend the language, and since C is a nice glue language, it is also a bridge to many other languages.
So if everything is simple, what's the deal with stability? Changes in the C-API might break functionality in older versions, so PEP 387 saves the day with a policy for backward compatibility. The Limited API, introduced in Python 3.2, defines a subset of Python's C-API with a promise: code that uses only this subset can be compiled with one version and run on many others as well.
Also, having a Stable ABI compatible wheel allows you to ship only one wheel per OS, rather than one wheel per Python version, which can simplify your release process.
This talk will introduce the Limited API concept, and provide the necessary information to include it in your project.
There has been a renaissance around Artificial Intelligence systems in recent years. However, despite the hype, only a small percentage (13%) of Machine Learning models see the light of day! Effectively building and deploying machine learning models is more of an art than a science. ML models are inherently complex, have fuzzy boundaries, and rely heavily on data distribution. But what if they are trained on biased data? Then they'll generate highly biased decisions! As the famous saying goes, “Garbage in, garbage out”: if a model is trained on a skewed and unfair data distribution, it is bound to produce skewed output. So, join me in this talk as I share my learnings in developing effective practices to build and deploy ethical, fair and unbiased machine learning models into production.
In this talk, we will present DuckDB. DuckDB is a novel data management system that executes analytical SQL queries without requiring a server. DuckDB has a unique, in-depth integration with the existing PyData ecosystem. This integration allows DuckDB to query and output data from and to other Python libraries without copying it. This makes DuckDB an essential tool for the data scientist. Besides learning about DuckDB's main characteristics, attendees will also experience a live demo of DuckDB and Pandas, the most used Python data-wrangling tool, in a typical data science scenario, focusing on comparing their performance and usability while showcasing their cooperation. The demo is most interesting for an audience familiar with Python, the Pandas API, and SQL.
Most recent work focuses on synthesizing independent images, while for real-world applications it is common and necessary to generate a series of coherent images for story-telling. In this work, we mainly focus on story visualization and continuation tasks and propose AR-LDM, a latent diffusion model auto-regressively conditioned on history captions and generated images. To the best of my knowledge, this is the first work to successfully leverage diffusion models for coherent visual story synthesis.
As the number of production machine learning use cases increases, we find ourselves facing new and bigger challenges where more is at stake. Because of this, it's critical to identify the key areas to focus our efforts on, so we can ensure our machine learning pipelines are reliable and scalable. In this talk we dive into the state of production machine learning, and we will cover the concepts that make production machine learning so challenging, as well as some of the recommended tools available to tackle these challenges.
Machines have become smarter than ever before. Recently, we have started using computers for solving problems beyond computation, and it might not be wrong to call them electronic creators. The laptop of the future might have a prompt-based word processor that replaces the current Word, where one has to type out their thoughts and formulate an entire document from scratch. Similarly, we might see a prompt-based Paint application, instead of the typical Paint program, that generates the paintings for us. In my opinion, AI-based applications are no longer fiction, and we may soon be using them on our computers. However, there is a possibility that Generative AI can be harmful to society. We need to explore the ethical concerns and how AI can impact our society. In this talk, we will try to understand how Generative AI is becoming a part of our future and how we can use it in a responsible and ethical manner.
Python is a popular language for data engineering but has some limitations in performance, concurrency, and production deployments. The Rust programming language offers powerful alternatives with strong compile-time and memory safety guarantees. In this talk, I'll explore how data engineers can leverage Rust to build high-performance data pipelines and processing systems. I'll cover the Rust ecosystem for data work, including frameworks and libraries for working with data formats, databases, streaming systems, and scientific computing. By combining Rust and Python, data engineers can harness the benefits of both languages and build robust end-to-end data systems that scale to meet demanding production needs.
This talk presents Appeal, a new library for command-line parsing in Python. Appeal avoids the cumbersome APIs and repetition endemic to the currently prevalent libraries in this space by leveraging Python's own function call interface. This talk will familiarize the audience with Appeal, its motivation, its approach, and its expressive power, and show them how to use Appeal in their own programs.
As developers, we learn early on that it’s important to focus on getting our code to work without unnecessarily pre-optimizing, but how do we learn to eventually optimize our code? What do we look for? How do you know when something is slow? How do you do something about it?
In this talk, we’ll discuss why your application performance matters, how you can learn to identify what matters most to you, and how Sentry has you in mind so you can effectively spend time improving the performance of critical user flows in your application.
At LocalStack, we are building a platform that enables development and testing of cloud applications on your local machine. The core is an open source AWS emulator that is primarily written in Python. It is among the top Python projects on GitHub, and has seen a massive uptake in contributions over the past two years. Many Python software developers and architects will relate to the struggles of maintaining a large and complex Python codebase, while keeping developer teams productive. In this talk, we'll explore how we at LocalStack tackle these as we re-create AWS for local development. We'll explain our approaches to automating around AWS specifications, building a highly modular and pluggable system to make it easy for teams to integrate their components, the software patterns we use to keep devs productive, as well as our approach to automated contract testing using pytest.
If you're like me, then you've long known about Python's "logging" module, but you've ignored it because it seemed too complex. In this talk, I'll show you that "logging" is easy to learn and use, giving you far more flexibility than you can get from inserting calls to "print" all over your code. I'll show you how you can start to use "logging" right away -- but also how you can use it to create a sophisticated logging system that sends different types of output to different destinations. After this talk, you'll know how to use "logging", and you'll be less likely to use "print" in your applications.
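To make the idea of sending different types of output to different destinations a little more concrete, here is a minimal sketch using only the standard library. The logger name "shop" and the two in-memory streams (standing in for, say, a console and a log file) are assumptions made for this illustration and are not taken from the talk.

```python
import io
import logging

# Two in-memory streams stand in for real destinations (e.g. console and file).
everything = io.StringIO()
problems = io.StringIO()

logger = logging.getLogger("shop")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False   # keep this example's records away from the root logger
logger.handlers.clear()    # start from a clean slate if the snippet is re-run

# Handler 1: receives every record from DEBUG upwards.
all_handler = logging.StreamHandler(everything)
all_handler.setLevel(logging.DEBUG)
all_handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

# Handler 2: receives only WARNING and above, with a shorter format.
problem_handler = logging.StreamHandler(problems)
problem_handler.setLevel(logging.WARNING)
problem_handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger.addHandler(all_handler)
logger.addHandler(problem_handler)

logger.debug("checking stock")
logger.warning("only %d items left", 2)

print(everything.getvalue(), end="")
print(problems.getvalue(), end="")
```

The debug message reaches only the first stream, while the warning reaches both. Swapping a stream handler for `logging.FileHandler` would send the same records to a file instead.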
Interactive control of robots can be a challenge, as it requires a lot of things to happen in parallel while at the same time reacting to data from sensors and control signals. Using Python's async facilities may greatly simplify this task, allowing us to write code that is similar to the non-parallel version, but that is at the same time easy to compose into a bigger program doing many things at once. I will talk about my own experiences programming the Fluffbug robot with CircuitPython, and point out the problems I ran into and the solutions I found.
Want to learn something new about yourself? This talk will showcase some approaches to get the best from behavioral tracking as well as silent wearables tracking. We will cover where and how to get data, including my experience with its quality (expectation management); what to do with the raw data (IDA plus some knowledge needed); and how to convert insights into actions.
We will explore the latest research on how children gain programming knowledge, how to keep them interested and excited, and how this might inform the way we support adult newcomers to programming. Practical advice and suggestions for activities will be given to attendees.
Do you need to transform, optimize and scale your data workflow? In this talk, we’ll review use cases, and you’ll learn how to dynamically generate thousands of DAGs (Directed Acyclic Graphs) with Airflow.
This talk will introduce dbt and demonstrate how to leverage Python to unlock its full potential. Attendees will learn best practices for working with dbt, how to integrate it with other tools in their data stack, and how to use Python packages like fal to perform complex data analysis. With real-world examples and use cases, this talk will equip attendees with the tools to build a modern, scalable, and maintainable data infrastructure.
With the release of Python 3.11 in October 2022, PEP 654 "Exception Groups and except*" became part of the language, and asyncio.TaskGroup() was added. This enhancement of exception and cancellation handling has allowed asyncio to evolve more flexibly, addressing existing issues with asyncio APIs, such as insufficient cancellation and exception handling in asyncio.gather.
In this talk, I would like to discuss the problems of existing asyncio APIs and how the newly introduced asyncio.TaskGroup() solves these issues. Attendees will learn about the improved way of handling exceptions and cancellations using asyncio.TaskGroup(), enabling them to write more efficient and robust asynchronous code with Python 3.11.
Inspired by Xarray, Scipp (scipp.github.io) enriches raw NumPy-like multi-dimensional data arrays by adding named dimensions and associated coordinates. For an even more intuitive and less error-prone user experience, Scipp adds physical units to arrays and their coordinates. Scipp data arrays additionally support a dictionary of masks, as well as histogram bin-edge coordinates.
One of Scipp's key features is the possibility of using multi-dimensional non-destructive binning to sort record-based "tabular"/"event" data into arrays of bins. This provides fast and flexible binning, rebinning, and filtering operations, all while preserving the original individual records.
Scipp ships with data display and visualization features for Jupyter notebooks, including a powerful plotting interface. Named Plopp, this tool uses a graph of connected nodes to provide interactivity between multiple plots and widgets, requiring only a few lines of code from the user.
Many of our sponsors are looking to hire talented people and EuroPython is the perfect place to reach out to them!
In this session, our sponsors will each give a short presentation about their company and what they do with Python. You will meet and hear about exciting opportunities from JetBrains, Kraken Technologies, Microsoft, Optiver, Sentry, Temporal Technologies, Google Cloud, Numberly and Arm. Afterwards, you can approach them directly at their sponsor booths to carry on the conversation!
Bring your GraphQL APIs to life with real-time data using Strawberry! 🌟 In this talk, we'll dive into GraphQL Subscriptions and explore how to leverage WebSockets for interactive, real-time updates. Say goodbye to constant polling and hello to efficient, seamless communication!
- Understanding GraphQL Subscriptions and their role in real-time data delivery.
- Setting up WebSocket connections and integrating them with your GraphQL server using Strawberry.
- Designing subscription schemas and handling server-side events for seamless updates.
- Enhancing client-side experiences with real-time data and updates.
Did you know that Python conferences are primarily organized by the community? Go backstage and join us at the Python
Join us for an engaging and insightful discussion as we bring together a group of passionate Python conference organizers from the community. Discover the vibrant ecosystem behind Python conferences and gain valuable insights into their experiences, motivations, and learnings.
Whether you are an aspiring organizer, a Python enthusiast, or simply curious about the inner workings of community-driven events, this panel promises to provide a wealth of knowledge and inspiration. Don't miss the opportunity to hear firsthand from the dedicated individuals who make Python conferences an incredible experience for all!
Come meet the folks who make the Python programming language!
A panel discussion of core Python developers will take place on Wednesday at 2pm. Hear what's on their mind, what they're working on, and what the future holds for Python.
The panel will include:
- sitting Steering Council member Pablo Galindo Salgado;
- cybersecurity expert and aspiring core developer Marta Gómez Macías who made f-strings much better in 3.12;
- CPython's Windows expert Steve Dower;
- Red Hat veteran and emeritus Steering Council member Petr Viktorin;
- and the tech lead of Microsoft's "Faster Python" team Dr. Mark "HotPy" Shannon.
The panel will be chaired by Łukasz "Any-color-you-like-as-long-as-it's-black" Langa.
Python offers decorators to implement reusable code for cross-cutting tasks. They support the separation of cross-cutting concerns such as logging, caching, or checking of permissions. This can improve code modularity and maintainability.
This tutorial is an in-depth introduction to decorators. It covers the usage of decorators and how to implement simple and more advanced decorators. Use cases demonstrate how to work with decorators. In addition to showing how functions can use closures to create decorators, the tutorial introduces callable class instances as an alternative. Class decorators can solve problems that used to be tasks for metaclasses. The tutorial provides use cases for class decorators.
While the focus is on best practices and practical applications, the tutorial also provides deeper insight into how Python works behind the scenes. After the tutorial, participants will feel comfortable with functions that take functions and return new functions.
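The three flavours mentioned above can be sketched in a few lines. This is an illustrative example rather than tutorial material; the names `count_calls`, `Cached`, `add_repr`, `greet`, `square` and `Point` are all made up for the purpose.

```python
import functools


def count_calls(func):
    """Closure-based decorator: counts how often func is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


class Cached:
    """Callable class instance used as a decorator: caches results by arguments."""

    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func
        self.cache = {}

    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.func(*args)
        return self.cache[args]


def add_repr(cls):
    """Class decorator: adds a generic __repr__ without needing a metaclass."""
    def __repr__(self):
        fields = ", ".join(f"{k}={v!r}" for k, v in vars(self).items())
        return f"{cls.__name__}({fields})"
    cls.__repr__ = __repr__
    return cls


@count_calls
def greet(name):
    return f"Hello, {name}!"


@Cached
def square(x):
    return x * x


@add_repr
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


greet("Ada")
greet("Bob")
print(greet.calls)                # calls recorded by the closure
print(square(4), square.cache)    # result plus the cache kept on the instance
print(Point(1, 2))                # uses the injected __repr__
```

The closure keeps its state on the wrapper function, the class instance keeps it in ordinary attributes, and the class decorator modifies the class after it has been created.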
This tutorial aims to demystify the built-in asyncio module by implementing it from scratch without any dependencies other than the Python Standard Library. We will go through the problem of blocking IO and how it is possible to solve it without a single "async" or "await" statement, using native Python concepts. Then, we will demystify the async/await syntax and see how it is implemented. We will also build our own scheduler with an API similar to asyncio's, able to run async functions the same way asyncio does. And finally, we will build a real asynchronous HTTP proxy using our own asyncio implementation. Why reinvent the wheel? "I hear and I forget. I see and I remember. I do and I understand."
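To give a flavour of cooperative scheduling without "async" or "await", here is a minimal sketch built on plain generators. It is not the tutorial's implementation: the `Scheduler` class, its `create_task` and `run` methods, and the `countdown` task are hypothetical names chosen for illustration, and no real IO is involved.

```python
from collections import deque


def countdown(name, n, log):
    """A task written as a plain generator: each `yield` hands control back."""
    while n > 0:
        log.append(f"{name}:{n}")
        n -= 1
        yield  # cooperative switch point, playing the role of `await`


class Scheduler:
    """Hypothetical round-robin scheduler, loosely mirroring an event loop."""

    def __init__(self):
        self.ready = deque()

    def create_task(self, gen):
        self.ready.append(gen)

    def run(self):
        while self.ready:
            task = self.ready.popleft()
            try:
                next(task)               # run the task until its next yield
            except StopIteration:
                continue                 # task finished, drop it
            self.ready.append(task)      # otherwise put it back in the queue


log = []
loop = Scheduler()
loop.create_task(countdown("a", 2, log))
loop.create_task(countdown("b", 3, log))
loop.run()
print(log)
```

The log shows the two tasks interleaving even though everything runs in a single thread. A fuller version would also need a way to park tasks until a socket becomes readable or writable, for example with the `selectors` module.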
The Python 2 to Python 3 migration used the D-day approach, which failed. We learnt from our mistake and now introduce incompatible changes differently: document the changes, provide a way to write code compatible with both the old and the new way, offer tooling to ease the migration, and design a long-term approach to reduce the need for incompatible changes.
Our EuroPython takes place in Prague - a city with some lessons for us, about programming, software and technology. More than 100 years ago Prague produced buildings that hint at how far our ideas in software might take us, and writers and artists who imagined challenges that have lately become real.
The Django Admin Panel is a complex and poorly documented tool in Django that can greatly speed up development once you start to understand it. “Isn’t it easier for us to write our own backend?” I will answer: “No, it’s not easier!” This talk shares 8 years of insights and discoveries. I want to talk about multiple admin sites, the possibilities of ModelAdmin, object state versioning, and app configs as a completely forgotten hidden power.
Finding good datasets or web assets to build data products or websites with, respectively, can be time-consuming. For instance, data professionals might require data from heavily regulated industries like healthcare and finance. In contrast, software developers might want to skip the tedious task of collecting images, text, and videos for a website. Luckily, both scenarios can now benefit from the same solution, Synthetic Data.
Synthetic Data is artificially generated data created with machine learning models, algorithms, and simulations, and this workshop is designed to show you how to enter that synthetic world by teaching you how to create a full-stack tech product with five interrelated projects. These projects include reproducible data pipelines, a dashboard, machine learning models, a web interface, and a documentation site. So, if you want to enhance your data projects or find great assets to build websites with, come and spend 3 fun and knowledge-rich hours in this workshop.
The more applications you build, the more libraries you share, the more they become an excuse to meet people and have fun together.
What makes it fun? And how do we keep it fun?
The Future of Microprocessors - a talk about the history of microprocessors, how we got here and what might happen next. There will be two laws, one equation, some graphs and a particle beam weapon out of Star Trek.
In a world full of microservices, distributing tasks is a constant challenge, and there's only one tool that can rule them all.
In this workshop, we'll introduce Celery - a tool for distributing tasks in an easy, fast, and flexible manner, and take you from zero to hero!
- We're going to understand why we need a distributed task system, and why to choose Celery.
- We'll write our first Celery task.
- Understand how to configure and run Celery.
- Familiarize ourselves with Celery's fundamental concepts.
- Dive into Celery's customizable options.
- Finally, we'll see a real-life example of how we used Celery in our production system and how we customized it to fit our needs, and discuss how you can do the same.
Learn how to build powerful terminal-based user interfaces (TUIs) with ease using Textual - an open-source Python framework.
Throughout this tutorial, you'll learn how to use Textual's built-in widgets, reactive features, and message-passing system to create a dynamic and user-friendly TODO app that's perfect for managing your daily tasks.
From creating and displaying tasks to editing and deleting them, you'll cover all the essential features needed to make a functional TODO app.
You'll also learn how to use Textual CSS to style your TUI for a polished and elegant look, together with some tips and tricks to make it even easier to develop your TUIs in Textual.
This tutorial provides everything you need to get started with building TUIs in Python. By the end of the tutorial, you'll have a fully functional and stylish TODO app that showcases Textual's versatility and useful features.
pytest lets you write simple tests fast - but also scales to very complex scenarios: Beyond the basics of no-boilerplate test functions, this training will show various intermediate/advanced features, as well as gems and tricks.
To attend this training, you should already be familiar with the pytest basics (e.g. writing test functions, parametrize, or what a fixture is) and want to learn how to take the next step to improve your test suites.
If you're already familiar with things like fixture caching scopes, autouse, or using the built-in tmp_path/monkeypatch/... fixtures: There will probably be some slides about concepts you already know, but there are also various little hidden tricks and gems I'll be showing.
You don't have to be an Ops expert to make Kubernetes useful! In this workshop, you will learn how to overcome complexity and love Kubernetes as a platform to deploy a Python web application or your data science and machine learning pipelines. You will learn how and when to use basic elements of Kubernetes like Deployments and StatefulSets.
Once you understand these basic elements, you will learn how to extend Kubernetes using Python. You will learn how to define custom resources and controllers to automate all things related to your applications' life cycle, from ETL through sending email for password reset to where your imagination stops.
By the end of this workshop, you will have deployed a Python web application and successfully extended Kubernetes with so-called operators to manage the complete life cycle of your application.
Ever seen a code base where understanding a simple method meant jumping through tangled class hierarchies? We all have! And while "Favor composition over inheritance!" is almost as old as object-oriented programming, strictly avoiding all types of subclassing leads to verbose, un-Pythonic code. So, what to do?
The discussion on composition vs. inheritance is so frustrating because far-reaching design decisions like this can only be made with the ecosystem in mind – and because there's more than one type of subclassing!
Let's take a dogma-free stroll through the types of subclassing through a Pythonic lens and untangle some patterns and trade-offs together. By the end, you'll be more confident in deciding when subclassing will make your code more Pythonic and when composition will improve its clarity.
Looking for a job is already a job. How can you make sure that you are successful in the role of a Python Developer job-seeker? Join this talk to learn directly from an insider the tips & tricks about what technologies are in-demand, how to look for your next role, how to display your experience (or lack of) in your CV, how to prepare for interviews, and much more.
You use the Python interpreter every single day. It does a lot of things for you: checks that your code has valid syntax and is properly indented, imports modules from various locations, and runs your code instruction-by-instruction.
But if you've ever wondered how exactly it happens, this talk will teach you the entire process by building a working Python interpreter from scratch.
The topic aims to introduce participants to the latest from Python in version 3.11, released in early October 2022, which includes:
- Speed improvements
- Standard library improvements
- Self type
- Exception notes
- Better error messages
- Improved type variables
- Variadic generics
- Marking individual TypedDict items as required or potentially missing
- Arbitrary literal string type
- Data class transforms
- TOML read-only support in stdlib
- Exception groups
- Negative zero formatting
For millennia, humans have known things. Pretty quickly, we started writing them down; our brains aren't very good at storing all the things we know reliably, and we needed something more durable.
A long time ago, this meant clay tablets with cuneiform on them, and things have only got more complicated from there. Nowadays, we try to store data so that computers can understand it too, and that's given us a bewildering array of options - portable hard drives, magnetic tape storage and so much more.
In this talk, we'll take a look at the history of data storage, and discuss why some methods have worked better than others. We'll talk about why writing things down for humans is different than doing it for computers, and why it's difficult to do both at the same time (this is what code is). Finally, we'll look at today's state-of-the-art for keeping data safe, and discuss what the future might hold.
This talk has no prerequisites, although a fondness for weird facts will help!
Are you a data scientist or developer working in healthcare? Are you tired of dealing with proprietary data formats for biological and vital sign information? It's time to unlock the power of open data and make your research more impactful.
In this talk, we'll explore how you can leverage Python analytics to manipulate and analyze complex datasets of patient information, including blood work, ECG, EEG, echocardiography, radiography, and more.
We'll also dive into the world of open data formats, and show you how using these formats can make it easier to anonymize, convert, and collaborate on research.
Don't miss this opportunity to learn how Python analytics and open data formats can help you unlock the insights hidden in your data and improve patient outcomes.
In this talk we will take you through the complete journey a website takes, from conception to running in production, the right way.
What is the best setup for local development, and how do you then move to testing and production? An opinionated talk from two veterans.
We're entering the age of machine-generated art. Many of the new systems are shockingly impressive but impossible to replicate by individuals because they rely on complex machine learning techniques with huge datasets that aren't feasible to do in a home environment. Fortunately, there's an entire group of clever approaches to generate graphics that look cohesive, unique, and deliberate... and that you can easily do on your own computer.
In this short talk we'll go through a few of those algorithms like Clifford attractors, slime mold simulation, and reduction of source imagery to geometric primitives. We'll generate images and animations, we'll dabble in 2D and 3D. You'll leave the talk with your own ideas how to create attractive visualizations out of thin air. The talk assumes familiarity with Python and high-school math.
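As a taste of the first algorithm mentioned above, here is a minimal sketch of a Clifford attractor using only the standard library. It assumes the commonly cited iteration x' = sin(a·y) + c·cos(a·x), y' = sin(b·x) + d·cos(b·y); the function names, the parameter values and the density-grid helper are illustrative choices, not taken from the talk.

```python
import math


def clifford_points(a, b, c, d, n, x=0.0, y=0.0):
    """Yield n successive points of a Clifford attractor."""
    for _ in range(n):
        # Both coordinates are computed from the previous point at once.
        x, y = (
            math.sin(a * y) + c * math.cos(a * x),
            math.sin(b * x) + d * math.cos(b * y),
        )
        yield x, y


def density_grid(points, size, extent):
    """Count how many points fall into each cell of a size x size grid.

    The grid covers the square [-extent, extent] in both directions;
    points outside that square are ignored.
    """
    grid = [[0] * size for _ in range(size)]
    for x, y in points:
        if abs(x) < extent and abs(y) < extent:
            col = int((x + extent) / (2 * extent) * size)
            row = int((y + extent) / (2 * extent) * size)
            grid[row][col] += 1
    return grid


# Illustrative parameters; small changes give very different pictures.
points = list(clifford_points(a=-1.4, b=1.6, c=1.0, d=0.7, n=5000))
grid = density_grid(points, size=40, extent=2.5)

# Crude text rendering: denser cells get a "heavier" character.
for row in grid:
    print("".join(" .:*#"[min(count, 4)] for count in row))
```

Mapping the cell counts to pixel brightness instead of characters, with any imaging library you like, gives the familiar smoky look of these attractors.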
Right after the devastating earthquakes in Turkey, there was a massive flow of tweets and posts from survivors and their relatives, calling for help. There was a need to extract the data, make it meaningful and open it to the public, so we came up with afetharita.com. The machine learning part of the application is completely based on open-source tools in Python, and I will go through the pipeline and the process.
Currently, SQL and Cloud Data Warehouses (DWH) are extremely popular for good reason. They are great for dashboarding and business intelligence (BI) use cases due to their ease-of-use. However, their combination might not be the best choice for every problem. More precisely, business-critical data pipelines with high complexity might be better suited for frameworks such as Apache Spark which greatly benefit from the tight integration with general purpose languages like Python (e.g., PySpark).
Expect an opinionated comparison between Apache Spark and seemingly easier-to-use cloud native SQL engines. By the end of this talk, you will be challenged to think about why they are complementary and when each has its justification.
Python packaging is quickly evolving and new tools pop up on a regular basis. Lots of talks and posts on packaging exist but none of them give a structured, unbiased overview of the available tools.
This talk will shed light on the jungle of packaging and environment management tools, comparing them on a basis of predefined features.
We'll cover the basics of Rust and demonstrate how to create a Rust module that can be imported and used within Python. Discover the advantages of using Rust in Python, especially regarding improved performance.
What would a Pythonista gain from becoming a Rustacean other than semicolons and brackets?
In this talk I'll share the learnings and achievements I got by adding the Rust programming language into my Python life. Illustrating a real story now in production at scale, I'll walk you through all the pains and joys of this unexpected journey which changed me more than I anticipated.
- Project introduction
- Motivations of selecting this project to learn Rust
- Tales of a Pythonista learning Rust
- Results, numbers and production graphs
- How Rust influences my daily Python
- Was it worth it? Should you do it too?
There are two hard problems in programming: naming things and cache invalidation. I'll cover the latter, in a microservice-based system. Given a fairly standard setup with an API Gateway and a backend service with its own database, I'll show how to implement a cache that allows us to avoid database queries without modifying the API client.
The whole talk is based on live coding.
Controllers deal with numbers all day long. They have to check a lot of data from different sources. Often the reports contain erroneous or missing data. Identifying outliers and suspicious data is time-consuming.
This presentation will introduce a Small Data Problem-End2End workflow using statistical tools and machine learning to make controllers' jobs easier and help them be more productive.
We will demonstrate how we used amongst others,
Learn to secure your Django apps by attacking (and then securing) Pygoat - An intentionally vulnerable Python Django application. Explore the OWASP top 10 vulnerabilities and understand how to mitigate them from Django apps.
This summit aims to bring together maintainers and users of Python with WebAssembly, to discuss the state of this ecosystem, existing challenges and ongoing work.
Find out more, including how to sign up, here: https://ep2023.europython.eu/wasm
How do you implement Infrastructure-as-Code (IaC) in a non-cloud environment?
Large Language Models (LLMs) have shown some impressive capabilities and their impact is the topic of the moment. What will the future look like? Are we going to only talk to bots? Will prompting replace programming? Or are we just hyping up unreliable parrots and burning money? In this talk, I'll present visions for NLP in the age of LLMs and a pragmatic, practical approach for how to use Large Language Models to ship more successful NLP projects from prototype to production today.
Updating Python versions often forces us to update native extensions at the same time. But what if you need to update Python because of a security issue, but cannot (yet) move to a newer version of a dependency? Or you are running a proprietary binary extension that cannot easily be recompiled?
The HPy project provides a better C extension API for Python. It compiles to binaries that work across all versions of CPython, PyPy, GraalPy. HPy makes porting from the existing C API easy and its design ensures that the binaries we produce today stay binary compatible with future Python versions.
NumPy is the single largest direct user of the CPython C API we know of. After over 2 years of work and more than 30k lines of code ported, we can demonstrate NumPy running its tests and benchmarks with HPy. We will show the same NumPy binary run on multiple CPython versions and GraalPy. And we will discuss performance characteristics of this port across CPython, GraalPy, and PyPy.
Are you afraid of AI? Are you afraid of your own government? Are you just a great developer who practices decent devops and wants to know how you might wind up helping the people who answered "yes" to the previous two questions? I'll review the nature of intelligence, ethics, and EU digital regulation, then we can all talk about how coders can help make the planet work better on solving problems like sustainability and peace.
Trans*Code is an international hack event series focused on building connections and community while exploring the tech side of transgender issues and opportunities. Coders, designers, activists, visionaries of all sorts, and community members not currently working in technology are all welcome and encouraged to participate. Find more details, including how to sign up, here: https://ep2023.europython.eu/trans_code
Are you new to EuroPython or any Python conference? You must have a lot of questions like:
- What is a Lightning Talk?
- What is an Open Space?
- Besides going to talks, what else can I do?
- Why does everyone seem to know each other, and how can I join in conversations?
Don't worry, we are here to help you get the most out of your first EuroPython experience. Come join us in this Beginner Conference Orientation that is tailor-made for you.
✨ We are running mentored sprints for diverse community members at EuroPython this year ✨
👉🏽 Apply to be a mentor on the day or feature your open source project: email email@example.com
👉🏽 Apply to participate as a contributor (sprint on the day): fill in this form.
If you are not a member of an underrepresented group of the community and want to take on the sprint we encourage you to bring someone from an underrepresented group with you.
👉🏽 Volunteer for our Git Helpdesk (4 volunteers needed): help new contributors with git/GitHub, such as cloning a repo, creating a branch, committing, and resolving merge conflicts. Fill in this form.
📖 Check out our online handbook to learn more about our approach to sprinting: https://mentored-sprints.netlify.app/
Many developers avoid using generators. For example, many well-known Python libraries use lists instead of generators. Generators themselves are slower than plain list loops, but using them in code can greatly increase the speed of the application. Let’s discover why.
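One possible reason, sketched below under our own assumptions: a generator pipeline is lazy, so it can stop as soon as the caller has what it needs, while a list-based version does all the work up front. The function names and the "expensive" step are hypothetical and only serve to count how much work each version does.

```python
from itertools import islice


def expensive_square(n, log):
    """Stand-in for a costly computation; records every call in `log`."""
    log.append(n)
    return n * n


def first_matches_eager(numbers, limit, log):
    # The list comprehension processes every input before we can filter.
    squares = [expensive_square(n, log) for n in numbers]
    return [s for s in squares if s % 3 == 0][:limit]


def first_matches_lazy(numbers, limit, log):
    # The generator expression only computes values when they are requested.
    squares = (expensive_square(n, log) for n in numbers)
    matches = (s for s in squares if s % 3 == 0)
    return list(islice(matches, limit))


eager_log, lazy_log = [], []
eager = first_matches_eager(range(1, 1001), 2, eager_log)
lazy = first_matches_lazy(range(1, 1001), 2, lazy_log)

print(eager == lazy)                  # same result either way
print(len(eager_log), len(lazy_log))  # but very different amounts of work
```

Each individual step of the lazy version carries a little more overhead, yet the lazy version as a whole only touches the inputs it actually needs.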
We will follow master detective Robot Holmes on his way to solve one of his hardest cases so far - a series of mysterious murders in the city of MLington. The traces lead him to the Vision-Language part of town, which has been a quiet and tranquil place with few incidents until lately. For a few months the neighbourhood has been growing extensively and careless benchmark leaders are dropping dead at an alarming rate.
Robot Holmes sets out to find the cause for this new development and will gather intel on some of the most notorious of the new citizens of the Vision-Language neighbourhood and find out what makes them tick.
This tutorial presents sktime - a unified, open source framework for machine learning with time series in Python. sktime provides interfaces to algorithms of various types, and modular tools for pipelining, composition, and tuning. You will learn how to identify your learning task, and how to build, use, and evaluate different algorithms on real-world data sets.
All tutorial notebooks are available in this repository and runnable from the cloud: https://github.com/sktime/sktime-tutorial-europython-2023
A Jupyter notebook is quite handy for rapid REPL (Read-Eval-Print-Loop) style tasks such as exploratory data analysis and data science. However, as a notebook grows larger and more complicated, the lack of proper software engineering support starts to hurt. This is because the Jupyter notebook lacks several important features, including code sharing, refactoring support, version control and advanced editing. Fortunately, traditional full-fledged IDEs such as VS Code or PyCharm are readily available, and they support these missing features very well. So why don’t we take advantage of the best of both worlds?
In this beginner-level hands-on talk, I will demonstrate how to transform Jupyter notebook workflows into a proper Python package using VS Code. I will also introduce several basic but essential refactoring recommendations. By doing so, you can use the package from several notebooks and even share it with your colleagues and friends.
Are you just getting started in the world of data science and feeling overwhelmed by the abundance of information on various packages, models, and techniques? Perhaps you're finding it challenging to decide which visualization package to use or which tools to begin with. Maybe you're puzzled by the distinctions between pip and Conda, or you're feeling bombarded by all the news about AI and large language models.
Worry no more! Join us for this Q&A session, where a panel of data science experts will be there to address all of your pressing questions. This session is designed to create a relaxed and welcoming environment for complete beginners in the field, offering guidance on topics that might be causing confusion.
Are you a complete beginner to coding, but would love to learn how to get started? Have you been curious about data science, but feel overwhelmed with all the talk of AI? Many people working in data science were once in the same position and know how hard it is to take those first steps.
However, fear not! It’s easier than you think to get started, and to help make this transition easier we’re hosting a workshop to teach beginners how to get started in Python and data science. We’ll start with Python basics, and show you how you can use this knowledge to easily read and transform data using Pandas, Python’s powerful data analysis library. You’ll also see how you can create beautiful, customized visualizations in Python using packages such as Matplotlib and Seaborn. You’ll also learn how to use a core data science tool, Jupyter notebooks, to run and check the output of your code. See more details on the workshop here: https://ep2023.europython.eu/humble-data
You don't know comprehensions!
Why are list comprehensions good? Because they are fast?
Because they are short?
This poster session will show why list comprehensions are an excellent Python tool.
The poster session will:
- teach you to convert loops to list comprehensions;
- show how to write list comprehensions from scratch; and
- give 10+ actionable tips and tricks for list comprehensions.
The poster will also show the similarities between list comprehensions and:
- set comprehensions;
- dict comprehensions; and
- generator expressions.
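To make the points above concrete, here is a small illustrative sketch (the word list is made up) that converts a loop into a list comprehension and then shows the sibling syntaxes side by side.

```python
words = ["apple", "kiwi", "banana", "fig", "cherry"]

# A classic loop: filter, transform, append.
long_upper_loop = []
for word in words:
    if len(word) > 4:
        long_upper_loop.append(word.upper())

# The same logic as a list comprehension:
# [<transform> for <item> in <iterable> if <condition>]
long_upper = [word.upper() for word in words if len(word) > 4]

# Set comprehension: curly braces, duplicates collapse.
distinct_lengths = {len(word) for word in words}

# Dict comprehension: curly braces plus a key: value pair.
length_by_word = {word: len(word) for word in words}

# Generator expression: parentheses, values produced one at a time.
total_characters = sum(len(word) for word in words)

print(long_upper_loop == long_upper)
print(distinct_lengths, length_by_word, total_characters)
```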
There are two values that everyone agrees with: Judicial Truth (criminals should be prosecuted, but innocent people left free), and Privacy (others shouldn't know unnecessarily about my private life).
But these two values are constantly put in opposition, e.g. video surveillance helps gather evidence of crime, but it endangers our legitimate rights as citizens.
That's why we launched the WitnessAngel initiative, a research effort to invent new concepts and technologies able to reconcile Judicial Truth and Privacy.
With algorithms like Flightbox, with ideas like VideoTestimony and Familiar, and with the open-source code we provide, we work with associations and enterprises to eventually put life-changing solutions into the hands of the general public, so that countless victims of rape, abuse and bullying stop facing the usual brick wall: "it's your word against theirs".
This talk will focus on implementing a password rotation strategy for your database without disrupting your Django server or other applications that consume the database. Regular password rotation is a critical security practice, but it can pose challenges for applications and servers that rely on the password for access. We will discuss the importance of password rotation and explore the challenges of rotating passwords for a database in use by a Django server. We will also discuss several techniques for safely rotating database passwords, such as using connection pools and leveraging environmental variables. By the end of the session, attendees will better understand the security risks associated with static passwords and how to mitigate those risks through password rotation while keeping their Django server and other applications running smoothly.
Multilabel classification is a machine learning task in which each instance is assigned to a group of labels. It has gained widespread use in various applications in recent years. Preprocessing, such as feature selection, is an important step in any machine learning or data mining task. It helps to improve the performance of an algorithm and reduce computational time by eliminating highly correlated, irrelevant, and noisy features. A new algorithm called Black Hole, inspired by the phenomenon of black holes, has recently been developed to tackle multi-label classification problems. In this talk, we present a modified version of the Black Hole algorithm that combines it with two genetic algorithm operators: crossover and mutation. The combination of Black Hole and genetic algorithms has the potential to solve multi-label classification problems across a range of domains.
Discover how coding conventions can enhance code quality, readability, maintainability, and reduce errors. Join us as we discuss the creation and implementation of coding conventions, and how to use linters for maintenance.
How do you get familiar with a codebase you need to maintain with minimum suffering? How do you leave the codebase easier to deal with for your colleagues, so they don’t have to suffer like you did?
Whether you are an experienced developer or a junior just starting your journey, inheriting a codebase can be a very challenging task, especially if the codebase is not quite up to your standards, or it’s just a huge and complex beast.
I will share the experience, tips and tricks on inheriting code that I have acquired during 12 years of software development on new and old projects.
The talk will provide guidelines to ease taking over code from somebody else, as well as remind developers of the importance that planning, preparation and documentation have in facilitating code change and project growth.
Arm is everywhere technology matters: 250+ billion chips in everything from sensors to smartphones to servers. Due to its simplicity, versatility, and growth in popularity over the past decade Python is the most used language in the world.
In this presentation I will show you what the status of Python is on Arm architecture on all major operating systems and how you could help to improve it further.
Ritchie Vink is the author of the new Polars DataFrame library, a library built for modern hardware.
Polars is a query engine that focuses on the DataFrame front-end. It is written from scratch in Rust and designed to be fast, parallel and memory efficient. In this talk we'll go through the design of Polars and some of its design decisions.
We will explore possibilities for making our data analyses and transformations in Pandas robust and production ready. We will see how advanced group-by, resample or rolling aggregations work on large time series weather data. (As a bonus, you will learn about Prague climate.) We will use type annotations and schema validations with the Pandera library to make our code more readable and robust. We will also show the potential of property-based testing using the Hypothesis package, with strategies generated from Pandera schemas. We will show how to avoid issues with time zones when working with time series data. By the end of the tutorial, you will have a deeper understanding of advanced Pandas aggregations and be able to write robust, production ready Pandas code.
The NVDA screen reader is a Python application packaged with Py2exe, along with C++ extensions for low-level system access and improved performance. Its functionality can be expanded through addons that are also written in Python, which makes the ability to debug both the core and addon code highly desirable.
However, debugging code within an embedded or packaged Python environment can be quite challenging, especially if you are a visually impaired programmer trying to debug your own screen reader, since hitting a breakpoint will freeze the tool you rely on for computer access!
In this presentation, I will demonstrate how I addressed this challenge by leveraging Microsoft's debugpy library for remote debugging. I will showcase how this technique can be used to debug Python applications running within an embedded Python environment, regardless of the host language. Additionally, I will explore its applicability in debugging applications running on different operating systems or environments than the one where you prefer to use your debugging IDE.
The EU Cyber Resilience Act (CRA) may have a huge impact on the open-source community. There are concerns about how this framework would be applied to open-source software contribution and distribution. If you would like to know more and voice your concerns, join our sessions with leaders in the Python community and experts in the field.
It's great when you can share the results of your analysis not only as a presentation but as something that non-data scientists can explore on their own, looking for insights and applying their business expertise to understand the significance of what they find.
With its accessibility for both creators and viewers, Streamlit offers a brilliant platform for data scientists to build and deploy data apps. Now, with the integration of ipyvizzu - a new, open-source data visualization tool focusing on animation and storytelling - you can quickly create and publish interactive, animated reports and dashboards on top of static or dynamic data sets and your models.
Are you tired of writing complicated code only to discover that Python has tools in its standard library that could have made your life easier? Join us for a tour of the standard library where we'll dive into less-known modules that do well-known things and well-known modules that do less-known things. This talk is tailored to beginners or anyone who wants to learn more about Python's standard library.
This talk will provide an introduction to Integer Programming and demonstrate how it can be used for conference scheduling. We will explore the basics of Integer Programming and how it can be applied to optimize the allocation of talks to time slots and rooms in a conference program. By the end of the talk, attendees will have a better understanding of how this powerful tool can help to create an efficient and effective conference schedule that maximizes attendee satisfaction. Whether you're a conference organizer or simply interested in learning more about optimization algorithms, this talk is for you!
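To give a feel for the model, here is a toy sketch under our own assumptions. It uses the same ingredients an integer program would: a 0/1 decision per (talk, slot) pair, the constraints "every talk gets exactly one slot" and "no slot is used twice", and an objective that counts clashes between talks sharing an audience. Instead of a real solver it simply enumerates all feasible assignments, which only works for tiny inputs. The talk names, slots and conflict pairs are made up.

```python
from itertools import permutations


def schedule(talks, slots, conflicts):
    """Return (assignment, clashes) minimising same-time conflicts.

    talks:     list of talk names
    slots:     list of (time, room) tuples, at least as many as talks
    conflicts: set of frozenset({talk_a, talk_b}) pairs that should not
               run at the same time (for example, a shared audience)
    """
    best_plan, best_cost = None, None
    # Each permutation gives every talk exactly one slot and never reuses
    # a slot, so both assignment constraints hold by construction.
    for chosen in permutations(slots, len(talks)):
        cost = sum(
            1
            for i in range(len(talks))
            for j in range(i + 1, len(talks))
            if chosen[i][0] == chosen[j][0]
            and frozenset((talks[i], talks[j])) in conflicts
        )
        if best_cost is None or cost < best_cost:
            best_plan, best_cost = dict(zip(talks, chosen)), cost
    return best_plan, best_cost


talks = ["A", "B", "C", "D"]
slots = [(9, "Room 1"), (9, "Room 2"), (10, "Room 1"), (10, "Room 2")]
conflicts = {frozenset(("A", "B")), frozenset(("C", "D"))}

plan, clashes = schedule(talks, slots, conflicts)
for talk, (time, room) in plan.items():
    print(f"{talk}: {time}:00 in {room}")
print("clashes:", clashes)
```

The number of assignments grows factorially with the number of talks, which is exactly why a real conference programme calls for an integer programming solver rather than enumeration.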
Information is abundant and readily available on the internet. However, the sheer amount of data can be overwhelming and time-consuming to navigate through. That's where web scraping comes in - a powerful tool used to extract data from websites and turn it into a usable format.
In this tutorial, we will explore the basics of web scraping and how to implement it using Scrapy (a Python framework). Whether you are a data analyst, programmer, or researcher, this tutorial will equip you with the fundamental skills needed to create your own web scraper and extract valuable information from websites.
The djangoproject.com website is the showcase of the Django project and is the result of contributions from many people. In this talk, we'll update you on its development and learn how to contribute to it.
Distributed databases are widely used in modern applications for their high availability and scalability. Have you ever wondered how data integrity is maintained when the data is spread across multiple nodes? One of the key components of achieving this is distributed consensus. Raft is a widely used consensus algorithm that provides a fault-tolerant and highly available system. In this talk, we will explore how to implement Raft consensus using the rqlite distributed database in Python.
Computer Programming, Technology, Bit-coinism Success, Climate Change and Billionaires are all associated with one another. This talk will describe how a cohort of 299+ young people (aged 11-14) were introduced to Python programming, at the same time, for the very first time. In this talk, I would like to share a great secret: I have actually learnt more from the young students than they learnt from me. This talk is about how these young people have opened my eyes, mind and heart to alternative ways of looking at and appreciating the humble IF statement, the underrated FOR loop, the dry return statement, the functional math and random modules, etc., as if one were an artist. We will talk about the renewed delight of looking at these things through fresh pairs of eyes and how we can take these new learnings forward.
Pymoo is an open source Python framework with state-of-the-art optimisation and post-performance analysis capabilities. It provides an object-oriented interface to solve constrained single- and multi-objective optimisation problems with a catalog of algorithms, customisations and post-optimisation evaluation functionalities. With additional features like visualisation of optimal Pareto fronts, decision making, parallelization and customised sampling, Pymoo promises to be highly valuable for scalable optimisation solutions.
HTMX has been quite popular lately in the Django circles and has demonstrated how powerful it can be with vanilla Django. But... have you thought about HTMX paired with Django REST Framework and more specifically paired with DRF's flexible renderer system?
Zero-downtime migration is a technique for running database migrations without stopping the web app. As clients' databases grow larger, applying necessary updates to the database can become time-consuming or potentially break the database schema. This talk will describe problematic operation types and provide a strategy for writing and running migrations to release new software versions without downtime.
Learn about the advantages and disadvantages of zero downtime deployment strategy, as well as best practices for implementing it in your organization. Learn how to make changes to production systems while keeping users up to date. Don't pass up this chance to optimize your software deployments.
As our understanding of the Universe is expanding, the desire to model the physics that govern cosmic evolution is more evident than ever, driving the emergence of cosmological simulations that model the Universe from the beginning of time till present day. In combination with Machine Learning, they allow for an unprecedented capability; one can train AI models on simulations, where the evolution history of galaxies is available, that can in turn be applied on real galaxies. In this work, we propose the use of Python as a ML tool, through the popular library Tensorflow, to quantify the impact of different cosmological models on the derivation of the history of galaxies. Python accompanies us at every step of the way, from creating the datasets and training the probabilistic neural networks to the visualization of the results, as we attempt to shed light on the cosmic past of galaxies, surpassing the unshakeable reality that we can only observe them at a specific moment in time.
A fully typed code base requires less test code to achieve the same level of confidence in its correctness. We'll analyze specific code examples and see how dependent types and exhaustiveness checking make certain classes of tests obsolete.
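As a rough illustration of the exhaustiveness-checking half of that claim, the sketch below uses a hypothetical `PaymentStatus` enum and a hand-written `assert_never` helper (newer Python versions ship an equivalent in `typing`). A static type checker such as mypy is expected to flag the final `else` branch as reachable if a new enum member is added without a matching branch, so a "did we handle every status?" unit test becomes less necessary.

```python
from enum import Enum
from typing import NoReturn


class PaymentStatus(Enum):
    """Hypothetical domain type, used only for this example."""

    PENDING = "pending"
    PAID = "paid"
    FAILED = "failed"


def assert_never(value: NoReturn) -> NoReturn:
    """Runtime fallback; statically, this call must be unreachable."""
    raise AssertionError(f"Unhandled value: {value!r}")


def describe(status: PaymentStatus) -> str:
    if status is PaymentStatus.PENDING:
        return "waiting for payment"
    elif status is PaymentStatus.PAID:
        return "payment received"
    elif status is PaymentStatus.FAILED:
        return "payment failed"
    else:
        # After the three checks above, the type checker has narrowed
        # `status` to "no possible value", which matches NoReturn.
        assert_never(status)


for status in PaymentStatus:
    print(status.name, "->", describe(status))
```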
Neural networks have revolutionized AI, enabling machines to learn from data and make intelligent decisions. In this talk, we'll explore two popular architectures: Attention models and Diffusion models.
First up, we'll discuss Attention models and how they've contributed to the success of large language models like ChatGPT. We'll explore how the Attention mechanism helps GPT focus on specific parts of a text sequence and how this mechanism has been applied to different tasks in natural language processing.
Next, we'll dive into Diffusion models, a class of generative models that have shown remarkable performance in image synthesis. We'll explain how they work and their potential applications in the creative industry.
This is a good talk for visual learners. I prepared schematic diagrams, which present the main features of the neural network architectures. By necessity, the diagrams are oversimplified, but I believe they will allow you to gain some insight into Transformers and Latent Diffusion models.
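For readers who prefer code to diagrams, here is a deliberately oversimplified sketch of scaled dot-product attention in plain Python. It assumes the textbook formulation softmax(QKᵀ/√d)·V and leaves out everything a real Transformer adds on top (learned projections, multiple heads, masking); the vectors are made-up toy data.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Return (outputs, weights) for scaled dot-product attention."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # How well does this query match each key?
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]

# A query pointing in the direction of the first key "attends" mostly to it.
outputs, weights = attention([[5.0, 0.0]], keys, values)
print(weights[0])
print(outputs[0])
```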
Cython started as a language designed to write extension modules, and has long become the most widely used static compiler for Python, bringing C and C++ data types into the language. Use it to talk to existing C/C++ code or to bring your Python code up to C speed.
Python dunder methods – like __init__ – are sometimes referred to as “magic methods” but they are not!
They are just regular methods!
Functions that are associated with objects and that you can call with arguments.
The only thing is... Python also calls those functions behind the scenes in certain situations!
So, let us learn what that is all about.
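A small illustrative sketch of that idea, using a made-up `Money` class: each dunder method below is an ordinary method you could call by hand, and the comments name the everyday syntax that makes Python call it for you.

```python
class Money:
    """Hypothetical example class; not taken from any library."""

    def __init__(self, amount, currency):
        # Called by Python when you write Money(1, "EUR").
        self.amount = amount
        self.currency = currency

    def __repr__(self):
        # Called by repr(obj) and by the interactive prompt.
        return f"Money({self.amount!r}, {self.currency!r})"

    def __eq__(self, other):
        # Called by obj == other.
        if not isinstance(other, Money):
            return NotImplemented
        return (self.amount, self.currency) == (other.amount, other.currency)

    def __add__(self, other):
        # Called by obj + other.
        if not isinstance(other, Money):
            return NotImplemented
        if self.currency != other.currency:
            raise ValueError("cannot add different currencies")
        return Money(self.amount + other.amount, self.currency)


one = Money(1, "EUR")
two = Money(2, "EUR")

print(one + two)          # Python calls one.__add__(two) behind the scenes
print(one.__add__(two))   # ...which you can also call like any other method
print(one + two == Money(3, "EUR"))
```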
The acquisition and processing of images to extract information is a field with many possibilities: the world is full of visual information which, applied to different areas, can demonstrate great potential.
Django is a framework that's been around for more than 15 years, which makes for enough legacy projects to deal with.
In this talk we'll show practical tips and tricks for how to get Django from legacy to latest & greatest.