Cp-322
4/5/2012
Isaac Cohen
Fabian Michalczewski
Steven Peters
Ron Roscoe Sabale
I pledge my honor that I have abided by the Stevens Honor System.
Transparent Box Functional Diagram
The figure below, the Transparent Box functional diagram, displays all of the components needed to create the Pitch Detection App. The inputs are either an mp3 file in the Android library or music being recorded through the microphone of the Android device. As can be seen, when using the microphone we also pick up background noise along with the desired music. To improve the pitch detection, the music signal will first be put through a noise filter to remove the undesired noise. The app will then convert the signal to the frequency domain, where the analysis to detect the pitch is performed. After the pitches have been analyzed by the algorithm, the app can display the information in several different ways: it can simply display the notes, play the notes, save the notes to a file for future use, or even display the notes being played on a piano. This gives the user a clear interface and a flexible way to present the information.
Figure 1- Transparent Box Functional Diagram
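To give a concrete picture of the microphone input branch of the diagram, the sketch below shows one way the app could capture a frame of raw audio using Android's AudioRecord API. The class name, sample rate, and frame size are our own assumptions for illustration rather than a final design, and the app would also need the RECORD_AUDIO permission.

// Minimal sketch of capturing one frame of raw PCM audio from the device
// microphone with Android's AudioRecord API. Class name, sample rate, and
// frame size are illustrative assumptions; requires the RECORD_AUDIO permission.
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

public class MicCapture {
    private static final int SAMPLE_RATE = 44100; // Hz, assumed sample rate

    public short[] readFrame(int frameSize) {
        int minBuf = AudioRecord.getMinBufferSize(
                SAMPLE_RATE,
                AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT);
        AudioRecord recorder = new AudioRecord(
                MediaRecorder.AudioSource.MIC,
                SAMPLE_RATE,
                AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT,
                Math.max(minBuf, frameSize * 2)); // buffer size in bytes (2 per sample)
        short[] frame = new short[frameSize];
        recorder.startRecording();
        recorder.read(frame, 0, frameSize); // blocking read of one analysis frame
        recorder.stop();
        recorder.release();
        return frame;                       // frame is then passed to the noise filter
    }
}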
Function-Means Tree Diagram
Upon detecting a sound or music file, our application will need to apply one of a number of possible techniques to detect the pitch changes present in the music and then present those pitch changes as variations in the notes being played. This general approach should convert the music file into sheet music representative of that music. Many techniques are available for the first step of detecting and representing pitches as data, which can then be further converted into notes and chords on the musical scale. Depending on which algorithm is used we will obtain different results with varying levels of accuracy, but by analyzing the similarities and differences among those results we can arrive at the most correct translation of the signal.
Figure 2- Function-means tree diagram
There are two primary ways we can generate the data we need from the music being played. The first is time domain analysis using the autocorrelation algorithm. The major goal of this type of analysis is to identify the fundamental frequencies of a signal by detecting changes in the polarity of the input signal, either from positive to negative or from negative to positive. It is at these transition points that candidate locations for the fundamental frequencies are found. This is done by shifting the signal backward and forward in time, making use of the assumption that a periodic signal is similar to itself over adjacent periods. Passing the signal through the autocorrelation function and identifying the minimum values of that function provides an initial list of frequency candidates, a number further reduced by finding the polarity transition points. This technique works well over small frequency ranges because we need to capture a window of at least double the period of the incoming signal. For our project this technique provides high-accuracy readings, especially in music segments with rapid swings in pitch, but it is unfeasible as the primary detector because it requires a large number of calculations; it can, however, work well for small ranges.
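As a rough illustration of the time domain approach, the sketch below estimates a pitch using the common shift-and-multiply form of autocorrelation, picking the lag with the strongest correlation inside a bounded pitch range. It is a simplified stand-in, not the exact minimum-and-polarity-transition variant described above, and the nested loops make visible why this approach is expensive as a primary detector.

// Simplified autocorrelation pitch estimate: shift the frame against itself,
// multiply and sum, and take the lag with the strongest correlation inside
// the allowed pitch range. Helper name and parameters are illustrative.
public final class AutoCorrelationPitch {
    /** Returns an estimated fundamental frequency in Hz, or -1 if none is found. */
    public static double estimate(short[] frame, int sampleRate,
                                  double minHz, double maxHz) {
        int minLag = (int) (sampleRate / maxHz); // small lag = high pitch
        int maxLag = (int) (sampleRate / minHz); // large lag = low pitch
        double bestCorr = 0;
        int bestLag = -1;
        for (int lag = minLag; lag <= maxLag && lag < frame.length; lag++) {
            double corr = 0;
            for (int i = 0; i + lag < frame.length; i++) {
                corr += (double) frame[i] * frame[i + lag]; // shift-and-multiply sum
            }
            if (corr > bestCorr) {
                bestCorr = corr;
                bestLag = lag;
            }
        }
        return bestLag > 0 ? (double) sampleRate / bestLag : -1;
    }
}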
The other way we can analyze the music is through frequency domain analysis, for which we can employ one or more of three techniques: maximum likelihood analysis, the Harmonic Product Spectrum, and hybrid cepstrum analysis. Each of these techniques converts the music signal from the time domain to the frequency domain, and it is in the frequency domain that they operate to produce an end result. In maximum likelihood analysis the input signal is matched against a set of idealized spectra in order to find the closest match between the input signal and the idealized case. This analysis is limited by cases where a signal falls in the middle of two pitches, which can cause problems in identification. Octaves outside the expected pitch range will also produce more errors. We will attempt to compensate for these errors with the other methods.
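A highly simplified version of the idealized-spectrum matching idea is sketched below: each candidate pitch is scored by the spectral energy found at its expected harmonic bins, and the best-scoring candidate wins. The template shape (five equally weighted harmonics) and the scoring rule are assumptions for illustration rather than the full maximum likelihood formulation.

// Toy spectral-template matching: score each candidate fundamental by summing
// magnitude at its expected harmonic bins and return the best scorer.
public final class TemplateMatchPitch {
    /**
     * @param magnitude  FFT magnitude spectrum of one frame
     * @param binHz      frequency resolution of one bin (sampleRate / fftSize)
     * @param candidates candidate fundamental frequencies in Hz (e.g. piano notes)
     */
    public static double bestMatch(double[] magnitude, double binHz, double[] candidates) {
        double bestScore = -1;
        double bestPitch = -1;
        for (double f0 : candidates) {
            double score = 0;
            for (int h = 1; h <= 5; h++) {             // first five harmonics (assumed)
                int bin = (int) Math.round(h * f0 / binHz);
                if (bin < magnitude.length) {
                    score += magnitude[bin];           // energy at the expected harmonic
                }
            }
            if (score > bestScore) {
                bestScore = score;
                bestPitch = f0;
            }
        }
        return bestPitch;
    }
}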
Another technique we can use is the Harmonic Product Spectrum (HPS), which takes the spectrum and starts by compressing it using downsampling. This isolates the fundamental frequencies of the signal because the same frequencies at higher orders are fused together by the downsampling. The downsampled spectra are then multiplied together, making the fundamental frequency stand out with relative ease. However, this technique works poorly at low frequencies. Lastly, we can use cepstrum analysis on our data. The first part of the analysis is to calculate the cepstrum by taking the DFT of the signal and examining it over a limited range of frequency values corresponding to the period of the sample. It is then normalized, and a probability algorithm and dynamic programming (based on a variety of factors) are used to determine the pitch with the highest probability of being the one you are listening to. This works very well at low frequencies and has been tested to be effective for the frequencies that encompass speech.
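The core of the HPS computation can be sketched in a few lines: the magnitude spectrum is downsampled by factors of 2 through K, the copies are multiplied so that harmonics reinforce the fundamental, and the strongest product bin is taken as the pitch. The number of harmonics and the bin-to-Hz conversion below are assumptions for illustration.

// Minimal Harmonic Product Spectrum: multiply downsampled copies of the
// magnitude spectrum so harmonics reinforce the fundamental, then return
// the strongest product bin converted back to Hz.
public final class HarmonicProductSpectrum {
    public static double estimate(double[] magnitude, double binHz, int harmonics) {
        int limit = magnitude.length / harmonics;
        double[] hps = new double[limit];
        for (int bin = 1; bin < limit; bin++) {
            double product = magnitude[bin];
            for (int k = 2; k <= harmonics; k++) {
                product *= magnitude[bin * k];   // spectrum downsampled by factor k
            }
            hps[bin] = product;
        }
        int best = 1;
        for (int bin = 2; bin < limit; bin++) {
            if (hps[bin] > hps[best]) best = bin;
        }
        return best * binHz;                     // winning bin back to Hz
    }
}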
While all of these techniques have drawbacks, each has a particular strength that may prove to be a valuable asset in our endeavor. Each analysis's weaknesses can be covered by the strengths of another, and we feel the fusion of all of these techniques into one program can produce results with the highest clarity, minimum error, and maximum frequency flexibility. While each will produce a different pattern, the patterns can be cleaned up and joined together to get as close as we can to a correct result.
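How the detectors are combined is still an open design choice; one simple possibility, sketched below, is to take the median of the per-frame estimates from the different analyses so that a single detector's outlier cannot dominate the final pitch track.

// One possible fusion rule (an assumption, not a committed design): keep the
// estimates from detectors that found a pitch and return their median.
import java.util.Arrays;

public final class PitchFusion {
    public static double fuse(double[] estimatesHz) {
        double[] valid = new double[estimatesHz.length];
        int n = 0;
        for (double f : estimatesHz) {
            if (f > 0) valid[n++] = f;      // keep only detectors that found a pitch
        }
        if (n == 0) return -1;              // no detector produced an estimate
        double[] trimmed = Arrays.copyOf(valid, n);
        Arrays.sort(trimmed);
        return trimmed[n / 2];              // median estimate in Hz
    }
}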
Android Architecture Diagram
The diagram below shows the major components of the Android operating system. Each section is described in more detail below.
Figure 3- Android Architecture Diagram
Applications:
The top level is the applications layer and refers to the core applications on an Android device. These include a calendar, an email client, maps, a browser, contacts, and phone functions, among others. All applications are written in the Java programming language.
Application Framework:
Android is an open development platform, and thanks to this developers are able to build highly customized apps for whatever their needs are. Developers have access to the full range of framework APIs used by the core applications. Underlying all applications is a set of services and systems, including:
- A rich and extensible set of Views that can be used to build an application, including lists, grids, text boxes, buttons, and even an embeddable web browser
- Content Providers that enable applications to access data from other applications (such as Contacts), or to share their own data
- A Resource Manager, providing access to non-code resources such as localized strings, graphics, and layout files
- A Notification Manager that enables all applications to display custom alerts in the status bar
- An Activity Manager that manages the lifecycle of applications and provides a common navigation backstack
These services and systems are what enable Android applications to function and will be utilized during the development of our app, as illustrated in the sketch below.
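As a small illustration of how these framework pieces would come together in our app, the sketch below shows a bare Activity whose lifecycle is driven by the Activity Manager, which loads a layout through the Resource Manager and updates a View with a detected note. The class name and the layout and view identifiers are placeholders we invented for the example.

// Bare Activity sketch: lifecycle managed by the Activity Manager, layout
// loaded through the Resource Manager, and a TextView from the View system.
// R.layout.main and R.id.note_text are hypothetical resources for this example.
import android.app.Activity;
import android.os.Bundle;
import android.widget.TextView;

public class PitchDisplayActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);                        // layout resource
        TextView noteView = (TextView) findViewById(R.id.note_text);
        noteView.setText("A4 - 440 Hz");                      // placeholder detected note
    }
}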
Libraries:
Android includes a set of C/C++ libraries used by various components of the Android system. These capabilities are exposed to developers through the Android application framework. Some of the core libraries are listed below:
- System C library - a BSD-derived implementation of the standard C system library (libc), tuned for embedded Linux-based devices
- Media Libraries - based on PacketVideo's OpenCORE; the libraries support playback and recording of many popular audio and video formats, as well as static image files, including MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG
- Surface Manager - manages access to the display subsystem and seamlessly composites 2D and 3D graphic layers from multiple applications
- LibWebCore - a modern web browser engine which powers both the Android browser and an embeddable web view
- SGL - the underlying 2D graphics engine
- 3D libraries - an implementation based on OpenGL ES 1.0 APIs; the libraries use either hardware 3D acceleration (where available) or the included, highly optimized 3D software rasterizer
- FreeType - bitmap and vector font rendering
- SQLite - a powerful and lightweight relational database engine available to all applications
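Since SQLite is available to every application, it is one natural way to implement the "save the notes for future use" output option from the functional diagram. The sketch below, with a table layout we made up for illustration, stores each detected note together with its offset into the recording.

// Hedged sketch of persisting detected notes with the bundled SQLite engine.
// Database, table, and column names are assumptions for illustration.
import android.content.ContentValues;
import android.content.Context;
import android.database.sqlite.SQLiteDatabase;
import android.database.sqlite.SQLiteOpenHelper;

public class NoteStore extends SQLiteOpenHelper {
    public NoteStore(Context context) {
        super(context, "notes.db", null, 1);
    }

    @Override
    public void onCreate(SQLiteDatabase db) {
        db.execSQL("CREATE TABLE notes (id INTEGER PRIMARY KEY, name TEXT, time_ms INTEGER)");
    }

    @Override
    public void onUpgrade(SQLiteDatabase db, int oldVersion, int newVersion) {
        db.execSQL("DROP TABLE IF EXISTS notes");
        onCreate(db);
    }

    /** Stores one detected note (e.g. "C4") with its offset into the recording. */
    public void saveNote(String name, long timeMs) {
        ContentValues values = new ContentValues();
        values.put("name", name);
        values.put("time_ms", timeMs);
        getWritableDatabase().insert("notes", null, values);
    }
}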
Android Runtime:
Android includes a set of core libraries that provides most of the functionality available in the core libraries of the Java programming language. Every Android application runs in its own process, with its own instance of the Dalvik virtual machine. Dalvik has been written so that a device can run multiple VMs efficiently. The Dalvik VM executes files in the Dalvik Executable (.dex) format which is optimized for minimal memory footprint. The VM is register-based, and runs classes compiled by a Java language compiler that have been transformed into the .dex format by the included "dx" tool.
The Dalvik VM relies on the Linux kernel for underlying functionality such as threading and low-level memory management.
Linux Kernel:
Android relies on Linux version 2.6 for core system services such as security, memory management, process management, network stack, and driver model. The kernel also acts as an abstraction layer between the hardware and the rest of the software stack.
Source: