Pointing out that “engineering does not equal programming”, implying that we can bring great “engineering” (solutioning) value without writing a line of code, he presents a modified version of a Venn diagram on how people in various roles split their time between programming, alignment, people, and “other”. I was bemused to note I currently sit squarely in the category labeled “beware” 🤨, a position that may continue throughout this year but will need to change over time.
One way that we can deliver value as senior engineers is by knowing when, and how, to break the rules. Although I am now in management, I am staying close to the work of one of our development teams. When I saw them struggling to meet the sprint commitment, I decided to lean in. There was a tangled situation with library dependencies. The thorniest problem lay with library A, which depended on a dozen or so classes in library B. Remembering a lesson I learned from a very smart “junior” developer a few years ago, I decided that Write Everything Twice (WET) trumped Don’t Repeat Yourself (DRY) in this case.
In other words, I just copied the required code from library B directly into library A. In this case, all of the copied code was highly stable, and generally of a “utility” nature rather than being business logic. Problem solved.
However: I probably could have contributed similar value by simply suggesting that the team duplicate that code, rather than taking it on myself. It felt good to do the work, and helped unblock the team, but it might not have been the maximum value I could have brought to the business. What was I not getting done - the opportunity cost? This is a critical lesson for this first-time manager to learn: think twice (or more) before jumping into the code. Get better at empowering than (hands-on) solving.
Maturino da Firenze. Alexander Cutting the Gordian Knot, 1510-1527. The Art Institute of Chicago.
The Ed-Fi ODS/API is a REST API that supports interoperability of student data systems. The API definition, via the Ed-Fi Data Standard, is extensible: many large-scale or specialized implementations add their own local use cases that are not supported out of the box by the Ed-Fi Data Standard (Extensions). Furthermore, the Data Standard receives regular updates; sometimes these are merely additive, and from time to time there are breaking changes. These factors make it impossible to create a one-size-fits-all client library.
But, not all is lost: the ODS/API exposes its API definition using OpenAPI, and we can use Swagger Codegen to build a client library based on the target installation’s data model / API spec. The basic process of creating a C# code library (SDK) is described in Ed-Fi documentation at Using Code Generation to Create an SDK (Note: this link is for ODS/API 7.1, but the instructions are essentially the same for all versions).
But what about Python? Yes, Swagger Codegen supports Python output. But it is not quite enough: you also need to manage authentication on your own. And running Swagger Codegen requires the Java Development Kit (JDK). The notes below will walk through generating a client library with help from Docker (no local install of the JDK required) and demonstrate basic usage of a simple TokenManager class for handling authentication.
See Ed-Fi-API-Client-Python for a source code repository containing the TokenManager class listed below, which might receive updates after this post has been published.
The Swagger Codegen tool is available as a pre-built Docker image, at repository swaggerapi/swagger-codegen-cli. We will use it to build a client package for working with Ed-Fi Data Standard v5.0, which is available through Ed-Fi ODS/API v7.1. The ODS/API Landing Page has links to the Swagger UI-based “documentation” (UI on top of OpenAPI specification) for all currently supported versions of the ODS/API. From there, we can find a link to the specification document.
The example shell commands use PowerShell, and they are easily adaptable to Bash or another shell. The generated code will be in a new edfi-client directory. Note that this repository’s .gitignore file excludes this directory from source control, since the original intent of this repository is to provide instructions, not a full-blown client. If you fork this repository and want to create your own package, then you may wish to remove that line from .gitignore so that you can keep your custom client code in your forked repository.
$url = "https://api.ed-fi.org/v7.1/api/metadata/data/v3/resources/swagger.json"
$outputDir = "./edfi-client"
New-Item -Path $outputDir -Type Directory -Force | out-null
$outputDir = (Resolve-Path $outputDir)
docker run --rm -v "$($outputDir):/local" swaggerapi/swagger-codegen-cli generate `
-i $url -l python -o /local
On my machine, this took about a minute to run. Here’s what we get as output:
> ls edfi-client
Directory: C:\source\Ed-Fi-API-Client-Python\edfi-client
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 11/27/2023 9:31 PM .swagger-codegen
d----- 11/27/2023 9:31 PM docs
d----- 11/27/2023 9:32 PM out
d----- 11/27/2023 9:31 PM swagger_client
d----- 11/27/2023 9:31 PM test
-a---- 11/27/2023 9:31 PM 786 .gitignore
-a---- 11/27/2023 9:31 PM 1030 .swagger-codegen-ignore
-a---- 11/27/2023 9:31 PM 359 .travis.yml
-a---- 11/27/2023 9:31 PM 1663 git_push.sh
-a---- 11/27/2023 9:31 PM 351139 README.md
-a---- 11/27/2023 9:31 PM 96 requirements.txt
-a---- 11/27/2023 9:31 PM 1811 setup.py
-a---- 11/27/2023 9:31 PM 69 test-requirements.txt
-a---- 11/27/2023 9:31 PM 149 tox.ini
We have code, we have tests, and even documentation. Here is a usage example from one of the auto-generated docs:
from __future__ import print_function
import time
import swagger_client
from swagger_client.rest import ApiException
from pprint import pprint
# Configure OAuth2 access token for authorization: oauth2_client_credentials
configuration = swagger_client.Configuration()
configuration.access_token = 'YOUR_ACCESS_TOKEN'
# create an instance of the API class
api_instance = swagger_client.AcademicWeeksApi(swagger_client.ApiClient(configuration))
id = 'id_example' # str | A resource identifier that uniquely identifies the resource.
if_match = 'if_match_example' # str | The ETag header value used to prevent the DELETE from removing a resource modified by another consumer. (optional)
try:
    # Deletes an existing resource using the resource identifier.
    api_instance.delete_academic_week_by_id(id, if_match=if_match)
except ApiException as e:
    print("Exception when calling AcademicWeeksApi->delete_academic_week_by_id: %s\n" % e)
I like to use Poetry for managing Python packages instead of Pip, Conda, Tox, etc. Converting the requirements.txt file for use in Poetry is quite easy with this PowerShell command (hat tip):
Push-Location edfi-client
poetry init --name edfi-client -l Apache-2.0
@(cat requirements.txt) | %{&poetry add $_.replace(' ','')}
Pop-Location
(The default requirements.txt file has some unexpected spaces; the replace command above strips those out.)
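For readers who are not using PowerShell, the same cleanup can be sketched in Python. This is only an illustration, not part of the generated client: the function name normalize_requirements is hypothetical, and the sample lines merely show the kind of spacing described above.

```python
def normalize_requirements(lines):
    """Strip whitespace from requirement lines, skipping blanks and comments."""
    cleaned = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Remove interior whitespace, e.g. "six >= 1.10" becomes "six>=1.10".
        cleaned.append("".join(stripped.split()))
    return cleaned


if __name__ == "__main__":
    # Illustrative input; the real file comes from the code generator.
    sample = ["certifi >= 14.05.14", "six >= 1.10", ""]
    for requirement in normalize_requirements(sample):
        # Each cleaned value could then be passed to "poetry add".
        print(requirement)
```

Each value returned could be handed to poetry add, one at a time, just as the PowerShell pipeline does.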
Note the line above with access_token = 'YOUR_ACCESS_TOKEN'. Swagger Codegen requires you to bring your own token generation routine. We can build one using portions of the client library itself. The ODS/API supports the OAuth 2.0 client credentials flow, which generates a bearer-style access token. A basic HTTP request for authentication looks like this:
POST /v7.1/api/oauth/token HTTP/1.1
Host: api.ed-fi.org
Content-Type: application/x-www-form-urlencoded
Accept: application/json
grant_type=client_credentials&client_id=YOUR CLIENT ID&client_secret=YOUR CLIENT SECRET
There are some variations in how these parameters can be passed, but this may be the most common / universal format, and this is what we will implement here.
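As a standalone illustration of the request shown above, here is a minimal sketch that builds the same form-encoded POST using only the Python standard library. The function name build_token_request is hypothetical, and the sketch deliberately stops short of sending anything over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) a client credentials token request."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return Request(
        token_url,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_token_request(
        "https://api.ed-fi.org/v7.1/api/oauth/token",
        "YOUR_CLIENT_ID",
        "YOUR_CLIENT_SECRET",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("ascii"))
```

Passing the resulting object to urllib.request.urlopen would send it; the JSON response is expected to carry the access token and its lifetime in seconds.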
Generated tokens are only good for so long; they expire. When a token expires, it would be nice if we could recognize that and automatically call for a new one, instead of encountering an error. The generated code does not support token refresh, and does not have an obvious hook for how to do so. For a very clean developer experience, the authentication and refresh mechanisms would be built into the ApiClient class created by Swagger Codegen. But be warned: if you rerun the generator, it will overwrite your customizations.
Someone with deeper Python expertise can probably come up with multiple ways to approach the refresh problem. This sample code handles token refresh very crudely, requiring the user of the code to detect the problem and try to re-authenticate. Perhaps a Context Manager implementation would help here.
Copy the following source code and paste it into a file called token_manager.py inside the edfi-client/swagger_client directory.
import json
from datetime import datetime, timedelta

from swagger_client.rest import ApiException
from swagger_client.configuration import Configuration
from swagger_client.api_client import ApiClient


class TokenManager(object):
    """
    Creates a new instance of the TokenManager.

    Parameters
    ----------
    token_url: str
        The token URL for the Ed-Fi API.
    configuration: Configuration
        Configuration options for the RESTClientObject. Study the
        RESTClientObject constructor carefully to understand the
        available options.
    """

    def __init__(self, token_url: str, configuration: Configuration) -> None:
        assert token_url is not None
        assert token_url.strip() != ""

        self.token_url: str = token_url
        self.configuration: Configuration = configuration
        self.client: ApiClient = ApiClient(self.configuration)
        # Treat the token as expired until the first authentication.
        self.expires_at = datetime.now()

    def _authenticate(self) -> None:
        # The client key and secret are read from the username and password fields.
        post_params = {
            "grant_type": "client_credentials",
            "client_id": self.configuration.username,
            "client_secret": self.configuration.password
        }
        headers = {
            "Content-Type": "application/x-www-form-urlencoded"
        }

        token_response = self.client.request("POST", self.token_url, headers=headers, post_params=post_params)
        data = json.loads(token_response.data)

        self.expires_at = datetime.now() + timedelta(seconds=data["expires_in"])
        self.configuration.access_token = data["access_token"]

    def create_authenticated_client(self) -> ApiClient:
        """
        Sends a token request and creates an ApiClient containing the returned access token.

        Returns
        -------
        ApiClient
            An ApiClient instance that has already been authenticated.
        """
        self._authenticate()
        return self.client

    def refresh(self) -> None:
        """
        Re-authenticates if the token has expired.
        """
        if datetime.now() > self.expires_at:
            self._authenticate()
        else:
            # Pass the message as the reason; the first positional argument is the HTTP status.
            raise ApiException(reason="Token is not expired; authentication failure may be a configuration problem.")
The following snippet demonstrates use of the token manager with a simple token refresh mechanism. Note that this tries to delete an object that does not exist; therefore, you should expect it to raise an exception with a 404 NOT FOUND message.
from swagger_client.configuration import Configuration
from swagger_client.token_manager import TokenManager
from swagger_client.api import AcademicWeeksApi
from swagger_client.rest import ApiException
BASE_URL = "https://api.ed-fi.org/v7.1/api"
config = Configuration()
config.username = "RvcohKz9zHI4"
config.password = "E1iEFusaNf81xzCxwHfbolkC"
config.host = f"{BASE_URL}/data/v3/"
config.debug = True
tm = TokenManager(f"{BASE_URL}/oauth/token", config)
api_client = tm.create_authenticated_client()
api_instance = AcademicWeeksApi(api_client)
try:
    api_instance.delete_academic_week_by_id("bogus")
except ApiException as ae:
    if ae.status == 401:
        tm.refresh()
        api_instance.delete_academic_week_by_id("bogus")
    else:
        raise
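The retry-on-401 pattern in the snippet above could be factored into a small reusable helper. What follows is a rough sketch under stated assumptions, not part of the generated client: call_with_refresh and FakeAuthError are hypothetical names, and FakeAuthError merely stands in for ApiException so the example runs without the client installed.

```python
def call_with_refresh(call, refresh, is_auth_error):
    """Run call(); on an authentication error, refresh once and retry."""
    try:
        return call()
    except Exception as error:
        if not is_auth_error(error):
            raise
        refresh()
        return call()


class FakeAuthError(Exception):
    """Stand-in for ApiException so this sketch runs without the client."""

    def __init__(self, status):
        super().__init__(f"status {status}")
        self.status = status


if __name__ == "__main__":
    state = {"token_valid": False}

    def delete_resource():
        # Simulates an API call that fails until the token is refreshed.
        if not state["token_valid"]:
            raise FakeAuthError(401)
        return "deleted"

    def refresh():
        state["token_valid"] = True

    result = call_with_refresh(
        delete_resource, refresh, lambda e: getattr(e, "status", None) == 401
    )
    print(result)
```

With the generated client, the first argument would be a lambda wrapping the API call and the second would be tm.refresh. Keep in mind that tm.refresh raises when the token has not actually expired, so a 401 caused by bad credentials still surfaces as an error.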
On a flight out to the #STATSDC2023 conference hosted by the National Center for Educational Statistics (my first time at this event), I finally wrote down my personal principles for ethical / responsible use of data and AI. Many have written about responsible use of data; there is nothing ground breaking here. Yet it feels meaningful, even if only for myself, to acknowledge “out loud” the values and principles that I wish to hold myself accountable for whenever I do use data, encourage others to make use of data, allow my own data to be used, etc.
☝Note First draft, 2023-08-10
As a person of faith, I tend to think of virtues before I think of values. Virtues are the spiritual foundations, and values are culturally-relevant extensions of those virtues. While there are countless variations on virtues and values, those below can serve as guiding lights for the principles that follow.
“Truthfulness is the foundation of all human virtues.” ‘Abdu’l-Bahá
Virtue | Value Correlate(s) |
---|---|
truthfulness | transparency, honesty, truth-seeking |
justice | equity, independence, consensus |
nobility | human dignity, recognition of aspirations, privacy |
Admittedly this is a hodge-podge list of principles, with no attempt to be systematic at this time.
Data Source Transparency: Retain as much information about data sources as possible. Vet them for integrity, completeness, appropriateness, and manner of collection. Only use ethically sourced information. Ensure personal data usage corresponds to the subject’s intent. Give credit where credit is due.
Example: avoid (shady) data aggregators / brokers.
Algorithmic Transparency: Prefer open algorithms. Look for, or conduct, audits to assess accuracy, applicability to the desired use case(s), and potential for biases. Look for explainability and ensure there is a reasonable human appeals process. Does a great deep learning model really outweigh having a good and explainable decision tree or a regression model?
Example: Do AI and cheat detection systems have low false positives? Are there clear appeals mechanisms? Have they been cross-validated against different subgroups to ensure they do not introduce biases? Can you explain why a person’s application is rejected or a promotion has been offered to one and not another when AI is involved in the process?
Side note: Does use of generative AI really matter in the given context?
Equity: Do not be content with doing no harm, rather seek to uplift marginalized perspectives and peoples. Consider institutional biases that may influence raw data (over and under representation). Consider environmental factors that may impact some populations, independent of or in concert with other demographic factors. Be mindful of accidental proxies.
Examples: Account for ethnic / racial disparities in policing, gender disparities in surveying, and geographic impacts such as longer bus rides to schools (“all the XYZ kids have poor attendance” might be due to late buses, or working parents, nothing to do with ethnicity).
Don’t feel virtuous by avoiding use of race while at the same time analyzing based on income, if income and race are highly correlated in a particular area.
Statistical Honesty: Use appropriate sample sizes and/or statistical tests to confirm assumptions. Cross validate results. Use a large number of iterations when bootstrapping and repeat to ensure stability of results.
Privacy: Practice appropriate anonymization and apply “need to know” (least privilege) restrictions. Watch out for small sample sizes that can unintentionally reveal identity. Use the least personal data available and relevant for the analysis.
Examples: Imagine an analysis that predicts a certain outcome for a target demographic, and the dataset only contains a few people from that demographic. Apply a filter by zip code. How hard is it to guess the person / household?
Can you use census block instead of street address? Zip code instead of census block? Age range instead of birth year?
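To make the small-sample concern concrete, here is a minimal sketch that counts records per group and flags the groups that fall under a threshold. Everything in it is an assumption for illustration: the function name small_groups, the field names, the made-up records, and the default threshold of 10, which is not a recommendation.

```python
from collections import Counter


def small_groups(records, keys, minimum=10):
    """Return the groups (by the given keys) that have fewer than minimum records."""
    counts = Counter(tuple(record[key] for key in keys) for record in records)
    return {group: count for group, count in counts.items() if count < minimum}


if __name__ == "__main__":
    # Made-up records for illustration only.
    records = (
        [{"zip": "00001", "group": "A"}] * 12
        + [{"zip": "00001", "group": "B"}] * 2
    )
    # Any group reported here may be small enough to identify individuals.
    print(small_groups(records, ["zip", "group"]))
```

A group that shows up in the result is a candidate for suppression or for coarser categories, such as zip code instead of census block.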
Consensus and Review: Seek review of data, algorithms, and outcomes from other experts and/or affected parties. Work to achieve consensus on techniques and communication. Start early to avoid the trap of stubbornly clinging to a misdirected idea.
Destiny: as a matter of justice, of nobility, and of truthfulness: be on guard for predictions becoming destiny.
Example: Whether policing on the street or teachers disciplining students, a prediction that one group will have more criminal / behavioral incidents easily leads to paying more attention to that group, ignoring other groups, and thus detecting (or “inventing”) more ill behavior in that group - thus reinforcing the original prediction.
Speaking of giving credit where credit is due, here are some sources that have helped shape my thinking on this topic:
And articles, asides, and conversations that are too numerous to remember or cite.
Also see references in:
Born into a large family on her parents’ farm in 1875 (she was the fifteenth child), she was taught early to look to the Bible for guidance and comfort, despite the family’s illiteracy. With help from a benefactress, she enrolled in school at the age of ten and eventually went on to collegiate study. Oft quoted as saying, “[t]he whole world opened up to me when I learned to read,” she went on to live an exceptional life of courage and action on behalf of all people, most particularly her fellow African Americans and especially women of color.
By 1904 she was in southern Florida, where she established a school and a hospital. That school eventually developed into today’s Bethune-Cookman University. Her mission reached beyond local concerns, as reflected in her voluminous writings published in papers and journals across the country, and in her civic engagement. She served in executive capacities with the National Urban League, the NAACP, and the National Council of Negro Women. An adviser to multiple U.S. Presidents, she was apparently the only woman of color present at the founding of the United Nations in 1945.
What drove this powerful woman? What gave her the strength to face down the Klan, to push for equality of rights and dignity within the Church halls and the halls of power, and to declare to her fellow Black people in the U.S. that “we, too, are Americans,” encouraging them to stand “shoulder to shoulder with all other groups of Americans, in defending the ideals of this country”? (1) In her own words:
“Love, not hate has been the fountain of my fullness. In the streams of love that spring up within me, I have built my relationships with all mankind. When hate has been projected toward me, I have known that the person who extended it lacked spiritual understanding. I have had great pity and compassion for them. Our Heavenly Father pitieth each one of us when we fail to understand. Jesus said of those who crucified Him,
‘Father, forgive them, For they know not what they do.’
Because I have not given hate in return for hate, and because of my fellow-feeling for those who do not understand, I have been able to overcome hatred and gain the confidence and affection of people. Faith and love have been the most glorious and victorious defense in this “warfare” of life, and it has been my privilege to use them, and make them substantial advocates of my cause, as I press toward my goals whether they be spiritual or material ones.” (2)
Faith and love refilled her as she continually emptied herself, just as Shoghi Effendi guided the Bahá‘ís to do when he wrote, “We must be like the fountain or spring that is continually emptying itself of all that it has and is continually being refilled from an invisible source. To be continually giving out for the good of our fellows undeterred by fear of poverty and reliant on the unfailing bounty of the Source of all wealth and all good—this is the secret of right living.” (3)
Her statue bears the epitaph, “Invest in the human soul, who knows, it may be a diamond in the rough.” So remarkably like Bahá‘u’lláh’s pronouncement, “Regard man as a mine rich in gems of inestimable value. Education can, alone, cause it to reveal its treasures, and enable mankind to benefit therefrom.” (4) Her own drive to be educated revealed her gems, it is clear, and enabled humanity to benefit therefrom.
Bethune was a devout Christian, a seminarian whose words and deeds testified to a belief in the vitality and importance of a Christ-centered life. While I have no inkling of her feelings about the Bahá‘í Faith, she was certainly aware of it (see advertisement below). To my way of thinking, her life serves as a wonderful example of what it means to live a “true Bahá‘í life”: pairing worship and service, championing the cause of the oneness of humanity, contributing to the prevalent discourses of society and to the social and economic development of the community, raising up the voices of women.
Photo by Addison Scurlock. Courtesy of Smithsonian Institution, National Museum of American History, Archives Center. Accessed courtesy of Flickr
Newspaper advertisement: The Brooklyn Daily Eagle, Brooklyn, New York. 22 Oct 1929. p 33.
Other works consulted:
Screenshot shows that I’m running Windows 10, and shows a small GUI window opened from both Powershell and from Bash using the same Python script.
Assuming you are already running Ubuntu in WSL, then the following commands will help install Python (all run from your Ubuntu/bash prompt, of course):
sudo apt update
sudo apt -y upgrade
sudo apt install python3 python3-pip
This will make the python3 command available in your path. I’m a fan of using Poetry instead of Pip for dependency management. It can be installed in the normal Poetry way.
I have a thing about typing python instead of python3, so I created an alias in Bash: alias python=python3. However, Poetry does not execute commands through Bash, so the command failed with an interesting error message: [Errno 2] No such file or directory: b'/usr/share/PowerShell/python'. Wonder why it looked in a PowerShell directory?
Not surprisingly, there are others who like to type one character less:
sudo apt install python-is-python3
Now the python command works as desired, from Bash and from Poetry.
Executing a Python-based GUI app from WSL seems… a bit odd… but let’s run with it, shall we? Because it is a requirement. We will need to use the tk toolkit for this class. If I understand correctly, it is included in Python 3.9+. But I have 3.8. Most likely I could find a way to upgrade to 3.9, but I don’t have a compelling reason yet, and the following command will install the tk support.
sudo apt install python3-tk
Next: how does WSL open a GUI window in Windows 10? You need an X-Windows compatible server for that. There are several proprietary and open source options available. I chose to go with the open source VcXsrv, which I installed in Windows (not WSL) via Chocolatey: choco install vcxsrv.
Once installed, you need to run it via the XLaunch command, which will be available in the Windows start menu. This Stack Overflow post has good suggestions for launching it correctly. I had to read through the first few posts to get the steps right. The application prompts you for configuration. Key values to use:
- Display number: 0
- Environment variable to set in Bash: export LIBGL_ALWAYS_INDIRECT=1

The answers mention opening the Windows Defender firewall to VcXsrv. The way they do this in the Stack Overflow post might be dangerous, especially in combination with disabling access control. A potentially safer* way is to simply allow WSL2’s network interface to access your local server. That means you are not opening your firewall to the Internet. This can be done with the following command, run from PowerShell in administrative mode:
New-NetFirewallRule -DisplayName "WSL" -Direction Inbound -InterfaceAlias "vEthernet (WSL)" -Action Allow
* I have not been in the business of writing firewall rules since the early 2000’s, so while I think this is correct, I might be mistaken. Please think through your security posture carefully before following this path.
Finally, back at the Bash prompt, you need to set the DISPLAY environment variable so that the X-Windows commands will be redirected to Windows. This variable will need to access Windows through the network, addressing the Windows computer by server name or IP address. Typically one might think of using “localhost”. However, WSL2 runs in an isolated network inside of Windows and it does not recognize your Windows as “localhost”. So for this command you must use the IP address at which WSL2 reaches the Windows host, which by default is listed as the nameserver in /etc/resolv.conf. Here is a convenient command that will read that IP address into the DISPLAY environment variable. The zero at the end assumes that VcXsrv was configured to run on display 0:
export DISPLAY=$(awk '/nameserver / {print $2; exit}' /etc/resolv.conf 2>/dev/null):0
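For anyone curious what that one-liner does, here is a rough Python equivalent that takes the first nameserver entry from the contents of /etc/resolv.conf and appends the display number. The function name display_from_resolv_conf is hypothetical and the sample address is made up. The sketch is also slightly stricter than the awk pattern, since it only accepts lines whose first field is exactly nameserver.

```python
def display_from_resolv_conf(text, display_number=0):
    """Return a DISPLAY value built from the first nameserver entry, or None."""
    for line in text.splitlines():
        fields = line.split()
        # Only accept lines of the form "nameserver <address>".
        if len(fields) >= 2 and fields[0] == "nameserver":
            return f"{fields[1]}:{display_number}"
    return None


if __name__ == "__main__":
    # Illustrative file content; the address is a made-up example.
    sample = "# generated by WSL\nnameserver 172.20.0.1\n"
    print(display_from_resolv_conf(sample))
```

In a real session the text would come from reading /etc/resolv.conf, and the result would be exported as DISPLAY before launching the GUI program.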
Now poetry run python -m tkinter should launch a little demonstration.
And for a more interesting demonstration, generating the windows shown in the image above:
import os
from platform import uname
from tkinter import Tk, ttk
root = Tk()
frm = ttk.Frame(root, padding=10)
frm.grid()
ttk.Label(frm, text=f"This window is running from {uname().system}").grid(column=0, row=0)
ttk.Button(frm, text="Quit", command=root.destroy).grid(column=0, row=1)
root.mainloop()
Two environment variables were created in this process. It would be tedious to come back to this post and recopy them every time a new Ubuntu/Bash shell is opened. Linux has a simple way of dealing with this: the .profile file contains instructions that run every time you open a command prompt. There is also a .bashrc file that runs next, whenever you run Bash (there are other shells that you could switch to, though Bash is the most popular). Edit either one.
You will need to use a text editor such as nano, vim, or code (if you don’t have it, typing code will automatically start the install of Visual Studio Code). All are excellent editors. Those who are new to Linux will probably feel more comfortable starting up Visual Studio Code. I use it all the time. But I also use the command line frequently when I only need to edit one file. Knowing how to use nano or vim is a wonderful skill to develop. Of the two, nano is easier to learn, and vim is more powerful. Whichever editor you choose, open the file like so: code ~/.profile. The ~ instructs the operating system to look for the file in your home directory.
Once you figure out which editor to use, just add the following two lines at the bottom of the .profile file:
export LIBGL_ALWAYS_INDIRECT=1
export DISPLAY=$(awk '/nameserver / {print $2; exit}' /etc/resolv.conf 2>/dev/null):0
Save that. Once saved, you can immediately invoke it, without starting a new window, with this command: source ~/.profile.