Looking for Focus
There’s a feeble double meaning in the title. This is about what I have in mind for some experiments with ultrasound, but also I’m doing a brain dump to get this out of my head so I can actually get on with things.
I apologise for any technical inaccuracies here, I’m still learning this stuff.
Before you find fault, I've gone and done an addendum.
I want to explore the feasibility of using a minimal configuration of inexpensive ultrasound transducers to capture a useful representation of objects in space, comparable to a camera image. Sonography on the cheap.
The motivation comes from medical imaging. The utility of ultrasound scanning is well-known, especially in gynaecology & obstetrics, where the health of a foetus can be examined visually. But the potential for diagnosis of respiratory diseases, notably Covid-19, has brought new impetus for research in the field.
When used in conjunction with Deep Learning-based processing, some very promising results have been delivered using standard equipment and software strategies; see in particular Deep learning for classification and localization of COVID-19 markers in point-of-care lung ultrasound.
But currently a significant issue for the practical application of these techniques is the simple financial cost of equipment. Devices are now appearing on the market that can provide medical ultrasound scanning for under €2000, but more typically they cost 10x or even 100x that amount. Add the cost of the data processing hardware needed to run sophisticated Deep Learning algorithms and the expense quickly becomes prohibitive, even for medical professionals in wealthy nations, let alone those in regions of the world with less funding available. In the context of a global pandemic, global monitoring is essential, not to mention the human question of healthcare disparity between countries.
The underlying principle of ultrasound imaging is essentially that of sonar. A pulse of sound is applied to the medium in question and the time it takes for an echo to return gives an indication of the distance of objects in that medium. Most medical ultrasound imaging probes operate by using an array of piezo transducers to direct a beam of aligned acoustic pulses into the body tissue. The response (probably received by the same array of transducers) can be reconstituted into a 2D image of the physical structure of the tissue. The design of the actual scanning hardware seems to vary, most typically being linear or convex arrays. Detailed technical documentation is rather overwhelmed by material aimed at healthcare professionals. But one of the more recent approaches, Phased Array Ultrasonics is quite intuitive, even if some fairly complicated maths might be needed to make sense of the data.
Medical ultrasound probes operate at frequencies in the ballpark of 5MHz. This kind of frequency is necessary to achieve a useful level of detail in images. The speed of sound in living tissue is close to that in water, around 1540m/s (compared to 340m/s in air, rather less in a vacuum…). Differences in the density of the tissue (the acoustic impedance) affect reflectivity and penetration.
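In code terms the sonar relation is trivial: distance is half the round-trip time multiplied by the speed of sound in the medium. A minimal sketch (the timing value is purely illustrative):

```python
# Sketch of the basic sonar relation. Speeds are the commonly quoted values.
SPEED_IN_AIR = 343.0      # m/s at roughly room temperature
SPEED_IN_TISSUE = 1540.0  # m/s, typical average for soft tissue

def echo_distance(round_trip_s: float, speed: float) -> float:
    """Distance to a reflector, given a round-trip echo time in seconds."""
    return round_trip_s * speed / 2.0

# The same 1 ms round trip means ~17 cm in air but ~77 cm in tissue.
print(echo_distance(1e-3, SPEED_IN_AIR))     # → 0.1715
print(echo_distance(1e-3, SPEED_IN_TISSUE))  # → 0.77
```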
Skimming the literature, it seems most probes use on the order of tens to hundreds of piezo transducers.
Downscaling for Economy
So medical probes are way too expensive for hobbyist-style research. Some other probes with similar characteristics are available far more cheaply, designed for applications around material analysis, for example detecting the depth of paint on a car body.
But ultrasound transducers designed to operate at 40kHz are outrageously cheap on the hobbyist market (say $1 per pair for Arduino-friendly distance sensors). Clearly, compared to 5MHz, their level of discrimination in something like body tissue falls well short. But change the medium to air and they can be useful for object detection. (Getting one of the common HC-SR04 modules attached to an Arduino to measure the distance of an obstacle is fairly trivial.)
Here’s my first assumption: that the key techniques required for analysis of objects in an acoustic medium will be essentially the same whether the context is living tissue at 5MHz or objects in air at 40kHz. This, I think, needs no justification.
My second assumption: that a large array of transducers isn’t necessary. As long as the data generated contains orthogonal elements describing the characteristics of objects in the medium, those characteristics, features, can be extracted to provide a faithful representation of the objects. Ok, theoretically speaking, I think that’s sound [sic]. But I accept this might turn out to be a lot harder to achieve in practice.
Here’s what I’m thinking: make a 2×2 array of transducers that will act as transmitters (yellow in the rather crude diagram below), and in the centre have another transducer that will act as a receiver (red). Have the spacing at a similar scale to the target objects, a significant number of wavelengths at the transducers’ resonant frequency (at 40kHz the wavelength in air is about 8.5mm; I’ve arbitrarily chosen 10x10cm). Close to the receiver transducer is a camera (green).
Send a kind of raster scan of impulses to the transmitters. The 4 transmitters lie in a Cartesian x-y plane.
Start with an impulse (spike) with amplitude 1 to the transducer at top left, 0 elsewhere. Record the echoes at the receiver. Then, a moment later, send an impulse of 0.9 to top left, with simultaneous impulse at top right of 0.1, again 0 elsewhere. Then step down a row, so … err…top left gets 0.9, top right…
Errrm, this might be easier to write in pseudocode than words:
Call the transducers x0y0, x0y1, x1y0, x1y1, with the amplitude of an impulse at each given by:

for x = -1 to 1 step 0.1
  for y = -1 to 1 step 0.1
    u = (x + 1) / 2
    v = (y + 1) / 2
    x0y0 = (1 - u) * (1 - v)
    x0y1 = (1 - u) * v
    x1y0 = u * (1 - v)
    x1y1 = u * v
  next y
next x
Grrr. Brain fog descended, no idea if that’s quite what I’m thinking of. But hopefully you get the idea, that the impulse amplitude sent to each transducer is related to a point on a 2D area defined by those corners.
PS. I tweaked the pseudocode to change the ‘raster’ from 0…1 to -1…1.
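To pin down what I mean, here's a minimal Python sketch of that raster scan, with the -1…1 coordinates mapped back to bilinear 0…1 weights so the four amplitudes always sum to 1 (the function name and step size are my own arbitrary choices):

```python
import numpy as np

def scan_weights(step: float = 0.1):
    """Yield (x, y, amplitudes) for each point of the -1..1 raster.

    The four amplitudes are bilinear weights: they sum to 1, and a
    point at a corner drives only that corner's transducer.
    """
    for x in np.arange(-1.0, 1.0 + step / 2, step):
        for y in np.arange(-1.0, 1.0 + step / 2, step):
            u, v = (x + 1) / 2, (y + 1) / 2   # map -1..1 to 0..1
            yield x, y, {
                "x0y0": (1 - u) * (1 - v),
                "x0y1": (1 - u) * v,
                "x1y0": u * (1 - v),
                "x1y1": u * v,
            }

# At the centre of the grid every transducer gets amplitude 0.25.
for x, y, amps in scan_weights():
    if abs(x) < 1e-9 and abs(y) < 1e-9:
        print(amps)
```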
So the system will record an audio signal of echoes for each point in the ‘grid’. Assuming the audio signal is quantized, this will provide a 3D matrix of values that is somehow related to the characteristics of the space in front of the transducers.
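To make the shapes concrete, here's what that matrix looks like in numpy terms (the grid size follows from a 0.1 step over -1…1; the sample count per point is an arbitrary assumption on my part):

```python
import numpy as np

GRID = 21          # 21 x 21 raster points: a -1..1 scan at step 0.1
N_SAMPLES = 2048   # assumed number of quantized echo samples per point

# One full scan gives a 3D matrix of (grid_x, grid_y, time) — exactly
# the kind of volume you could feed to a convolutional network.
echoes = np.zeros((GRID, GRID, N_SAMPLES), dtype=np.int16)
print(echoes.shape)  # → (21, 21, 2048)
```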
There will no doubt be a huge amount of scattering, noise and artifacts unrelated to the target shape. But this is very much the kind of data that convolutional networks are very good at figuring out.
I need to have a play on a breadboard, but the circuit should be relatively straightforward: an impulse generator with 4x independent amplitude control. My first thought was to use four voltage-controlled amplifiers, as used in analog music synthesizers (eg. see the LM13700 datasheet), driven by a microcontroller and quad DAC for the scanning levels. But I’m not sure this isn’t overengineering, because it only needs to control the amplitude of an impulse, not an arbitrary waveform. Need to think on…
Again, something I can probably only determine by playing on the breadboard: it might well be useful to blank the receiver for some period covering the transmitted impulse, using another VCA or analog switch, to be kind to the receiver amplifier.
There are two parts to this: the microcontroller to tell the transducers what to do, and the processing/analysis. Recent experience has shown me that the Arduino and related families of devices are very straightforward to use for tasks like this. I’m a fan of the ESP32 in particular: it has built-in WiFi, it’s cheap, and surprisingly fast.
Oh Yeah, the Camera
Right, so, the medical ultrasound Deep Learning work linked above is essentially a categorization task: is this Covid? How bad are this person’s lungs?
For a system as I have in mind, a crude but analogous target would be discrimination between, say, a cube, a cylinder and a sphere in front of the ‘probe’, at any given distance, orientation etc. I don’t know yet, but I think it may be beneficial to take a step back from pseudo-diagnostic categorization in the first instance, and just try to get a static image from the acoustic sensors that is reasonably close to the visual image from a camera.
Hence the camera.
Ok, using minimal-expense hardware, we’re talking very low-definition grayscale. Compare the camera image with the output of a (relatively small) Deep Learning network.
I’ve a feeling the wiring of a suitable network could be feasible: a few convolutional layers and some small number of fully-connected layers (ie. something my laptop can handle). The error function for training is pretty well covered in the literature: image similarity, think Google’s search by image.
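As the simplest possible sketch of such an error function, a plain mean-squared-error comparison in numpy (this assumes the camera frame and the network output are grayscale arrays of the same size, normalised to 0…1; the 32×32 stand-in images are just for illustration):

```python
import numpy as np

def image_mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error between two grayscale images in [0, 1]."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    return float(np.mean((a - b) ** 2))

camera = np.full((32, 32), 0.5)      # stand-in low-definition camera frame
predicted = np.full((32, 32), 0.25)  # stand-in network output
print(image_mse(camera, predicted))  # → 0.0625
```

Raw MSE is a blunt instrument for "does this look like the same object"; perceptual similarity measures would likely do better, but MSE is the obvious starting point.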
The idea of making medical ultrasound cheaper is a ridiculously ambitious goal. But that isn’t really the point. Just having a look, trying things that have a massive probability of failure but might still produce some potentially useful results, is good motivation. Good fun too.
I’ll be putting any notes of progress up here or over on GitHub.
Thoughts & suggestions please.