PSK Automatic Propagation Reporter -- for Developers

This is a project to automatically gather reception records of PSK activity and then make those records available in near realtime to interested parties, typically the amateur who initiated the communication. Many amateurs will run a client that monitors received traffic for callsigns (the pattern 'de callsign callsign' or others) and, when one is seen, reports this fact. This is of interest to the amateur who transmitted, who will be able to see where their signal was received. The pattern chosen is typically part of a standard CQ call. The duplicate check is to make sure that the callsign is not corrupted. This is not meant to be an exhaustive list of patterns to check. There is opportunity for innovation here.

Also, this project was originally conceived for PSK use, but it is not specific to that mode. Most of the panoramic receivers have PSK decoders, which is what makes this work so well. If there were a panoramic RTTY decoder, then I would encourage submission of that data as well.
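
As a rough illustration of the 'de callsign callsign' check described above, the following Python sketch looks for a token that is repeated immediately after 'de'. The regular expression and the name find_callsigns are assumptions made for this example; they are not taken from any particular client, and real callsign syntax is more varied than this pattern allows.

```python
import re

# Hypothetical pattern: "de", then the same token twice in a row.
# A token is 3 or more letters, digits or "/" (so prefixes and suffixes are kept).
CALL_PATTERN = re.compile(r"\bde\s+([a-z0-9/]{3,})\s+\1\b", re.IGNORECASE)

def find_callsigns(decoded_text):
    """Return callsigns seen in a 'de CALL CALL' pattern, upper-cased."""
    return [m.group(1).upper() for m in CALL_PATTERN.finditer(decoded_text)]

print(find_callsigns("cq cq cq de n1dq n1dq n1dq pse k"))  # ['N1DQ']
print(find_callsigns("cq cq cq de n1dq n1dx pse k"))       # [] (copies differ)
```

Requiring the two copies to match is what rejects a callsign that was corrupted in one of its repetitions.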

The way that this would be used is that an amateur would call CQ and could then (within a few minutes) see where his signal was received. This can be useful in determining propagation conditions or in adjusting antenna and/or radio parameters. It will also provide an archive of reception records that can be used for research purposes.

There are a number of parts to this project, as shown in the picture below. Each is dealt with separately.

Data Gathering

The data is gathered (somehow) from the client used to decode the PSK traffic. This is not standardized as it depends on the details of the client. Some clients are capable of decoding the entire audio passband simultaneously and these can provide a great deal of useful information.

The data consists of the calling callsign (at a minimum). Highly desirable extra fields are the frequency, signal to noise ratio and intermodulation distortion. Note that each callsign should be reported no more than once per five minute period. Ideally, a callsign should be reported only once per hour if it has not 'changed'. Precisely what constitutes a change is left to the discretion of the developer, but the goal is to minimize the number of database records! A change might be a move to a different band.
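
One possible way to apply these limits is sketched below. The class name ReportFilter is hypothetical, and treating a change of band as the only kind of 'change' is an assumption; the text above leaves that choice to the developer.

```python
import time

MIN_INTERVAL = 5 * 60          # never report a callsign more often than this
UNCHANGED_INTERVAL = 60 * 60   # an unchanged callsign is reported at most hourly

class ReportFilter:
    """Decides whether a decoded callsign should be queued for reporting."""

    def __init__(self):
        self._last = {}  # callsign -> (time of last report, band at that time)

    def should_report(self, callsign, band, now=None):
        now = time.time() if now is None else now
        key = callsign.upper()  # callsigns are case insensitive
        previous = self._last.get(key)
        if previous is not None:
            last_time, last_band = previous
            elapsed = now - last_time
            if elapsed < MIN_INTERVAL:
                return False
            # Assumption: "changed" means "moved to a different band".
            if band == last_band and elapsed < UNCHANGED_INTERVAL:
                return False
        self._last[key] = (now, band)
        return True
```

A client would call should_report for every decoded callsign and only add the record to the pending datagram when it returns True.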

Authors of PSK software are encouraged to implement the data reporting protocol as described in the next section.

Data Reporting

Once the data has been gathered (and duplicate callsigns eliminated), it is formatted into UDP datagrams and transmitted to report.pskreporter.info port 4739. Note that as of August 2018, sending the datagrams to pskreporter.info will no longer work. The report.pskreporter.info domain name must be used. The datagram format is IPFIX (RFC 5101). The phrases used on this page do not line up with the terms used in the RFC. This is deliberate, as this is a significant simplification of the RFC.

The datagrams should be sent at a rate of no more than one every five minutes (unless the packet becomes full). Timed sends of packets must not be synchronized to the system clock. Any flushing of packets on timers can be based on when the program started. If it is based on when the last signal was received, then please add some randomization to the five minute timer. This is to prevent a significant number of stations from reporting at the same time.
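
A minimal sketch of a randomized flush timer follows. The 30 second jitter range and the name next_send_delay are assumptions; the text above asks for randomization but does not say how much.

```python
import random

BASE_INTERVAL = 300.0  # five minutes, from the text above
MAX_JITTER = 30.0      # assumed spread; no figure is given on this page

def next_send_delay(rng=random):
    """Seconds to wait before the next flush.

    Jitter is only ever added, so the delay never drops below five minutes.
    """
    return BASE_INTERVAL + rng.uniform(0.0, MAX_JITTER)

print(round(next_send_delay(), 1))
```

The delay is measured from program start or from the previous flush, never from a round wall-clock time, so that stations do not all report in the same second.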

The IPFIX packet format contains (at a very high level) two parts — a number of record format descriptors that describe the format of the data part, and the data part. The record format descriptors only need be transmitted once per hour, but probably should be transmitted for the first three packets when the application starts (this is due to the lossy nature of UDP). Note that the same UDP source port number should be used to send each of the datagrams. The server saves these record format descriptors and uses them to decode packets which do not contain them.

The data part of the packet contains two sub-parts. The first is the receiver information record, and the second is the list of senders that have been detected (sender information block containing one or more sender information records).

For the technically inclined, read the next section. For the people who just want to hack together the packet sender, skip the next section, and move to the Cookie Cutter section.

Technical IPFIX Information

The attributes used for this application are:
Name | Attribute Id | Type | Meaning
senderCallsign | 30351.1 | string | The callsign of the sender of the transmission.
receiverCallsign | 30351.2 | string | The callsign of the receiver of the transmission.
senderLocator | 30351.3 | string | The locator of the sender of the transmission.
receiverLocator | 30351.4 | string | The locator of the receiver of the transmission.
frequency | 30351.5 | unsignedInteger | The frequency of the transmission in Hertz.
sNR | 30351.6 | integer | The signal to noise ratio of the transmission. Normally 1 byte.
iMD | 30351.7 | integer | The intermodulation distortion of the transmission. Normally 1 byte.
decoderSoftware | 30351.8 | string | The name and version of the decoding software.
antennaInformation | 30351.9 | string | A freeform description of the receiving antenna.
mode | 30351.10 | string | The mode of the communication. One of the ADIF values for MODE or SUBMODE.
informationSource | 30351.11 | integer | Identifies the source of the record. The bottom 2 bits have the following meaning: 1 = Automatically Extracted. 2 = From a Call Log (QSO). 3 = Other Manual Entry. The 0x80 bit indicates that this record is a test transmission. Normally 1 byte.
persistentIdentifier | 30351.12 | string | Random string that identifies the sender. This may be used in the future as a primitive form of security.
flowStartSeconds | 150 | dateTimeSeconds (Integer) | The time of the transmission (absolute seconds since 1/1/1970).

Most of the attributes are enterprise specific, and use the enterprise number 30351 which is registered to me.

Some of these attributes are used in the sender information records, and some are used in the single receiver information record. The receiver information record should appear in each datagram.

The sender information records should contain as many as possible of: senderCallsign, frequency*, iMD*, sNR*, mode, informationSource, senderLocator*, flowStartSeconds. The starred fields are optional, and the others not.

The receiver information record should contain as many as possible of: receiverCallsign, receiverLocator, decoderSoftware, antennaInformation*.

The variable length encoding should be used for string fields. This probably gives about 16 bytes (on average) per reception record, and so a datagram can hold 80 to 90 records. This limit is unlikely to be reached in five minutes.

The record format descriptor should be transmitted at least once per hour, and in the first few packets on startup.

Cookie Cutter Information

The packet format appears to be complex, but there is a lot of boilerplate. The packet is assembled out of a set of pieces, with the actual data appended to the end.

All values are transmitted in 'network order'. This is with the most significant byte of an integer being transmitted first. Note that this is not the same byte ordering as a PC. The C functions htonl/htons will do the appropriate conversions (for all platforms).
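
For those not working in C, the same conversion can be sketched in Python with the standard struct module. The frequency value is the one used in the examples later on this page.

```python
import struct

frequency_hz = 14070567
# "!" selects network (big-endian) order; "I" is an unsigned 4-byte integer.
encoded = struct.pack("!I", frequency_hz)
print(encoded.hex(" ").upper())  # 00 D6 B3 27
```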

For a definition of the fields, see the attributes table above.

Header

The header contains the overall length of the packet. It also contains a sequence number and the time of transmission. All times are 'unix times' — i.e. the number of seconds since 1/1/1970 00:00 UTC. This is the value returned by the unix function time(0). The sequence number allows detection of missed and duplicate packets.
00 0A ll ll tt tt tt tt ss ss ss ss ii ii ii ii
'll ll' is the two byte length code that is the length of the entire datagram. 'tt tt tt tt' is the transmission time, and 'ss ss ss ss' is the sequence number. 'ii ii ii ii' is a random identifier that helps associate packets with UDP streams. This is needed to deal with nasty cases of residential NAT/PAT gateways and DHCP. The identifier should be constant for any particular sender (at least within a single session).
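
A small sketch of packing this header in Python is shown below. The function name build_header is hypothetical, and drawing the identifier from os.urandom is just one way of obtaining a random value.

```python
import os
import struct
import time

def build_header(total_length, sequence_number, session_id, now=None):
    """Pack the 16-byte header: 00 0A, length, time, sequence number, identifier."""
    now = int(time.time()) if now is None else int(now)
    return struct.pack("!HHIII", 0x000A, total_length, now,
                       sequence_number, session_id)

# One random identifier per session, reused for every datagram that is sent.
SESSION_ID = int.from_bytes(os.urandom(4), "big")

# Fixed values here so that the output is reproducible.
header = build_header(0x5C, 4, 0, now=1200960114)
print(header.hex(" ").upper())
# 00 0A 00 5C 47 95 32 72 00 00 00 04 00 00 00 00
```

The length is not known until the rest of the datagram has been assembled, so in practice the header is built last and placed at the front.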

Record Format Descriptor

The record format descriptors define the precise layout of the data portion of the datagram. The descriptors do not need to appear in every datagram. They are cached by the receiving process. However, it is good practice to send them in the first three datagrams sent on application startup, and then once per hour thereafter.
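
The 'first three datagrams, then hourly' rule could be tracked with something like the following. The class DescriptorSchedule is a hypothetical helper written for this page, not part of any existing client.

```python
import time

class DescriptorSchedule:
    """Tracks when the record format descriptors should accompany a datagram."""

    STARTUP_PACKETS = 3     # first three datagrams after startup
    RESEND_SECONDS = 3600   # then once per hour

    def __init__(self):
        self._packets_sent = 0
        self._last_descriptor_time = None

    def include_descriptors(self, now=None):
        """Call once per datagram; True means the descriptors should be included."""
        now = time.monotonic() if now is None else now
        due = (self._packets_sent < self.STARTUP_PACKETS
               or now - self._last_descriptor_time >= self.RESEND_SECONDS)
        self._packets_sent += 1
        if due:
            self._last_descriptor_time = now
        return due
```

With one datagram every five minutes, this gives three datagrams with descriptors at startup and then roughly one in every twelve after that.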

There are two format descriptors, one for the sender information record, and one for the receiver information record. In each case, there are a number of different descriptors to pick — they have different fields. Pick the one that matches the data that you have available.

The two descriptors (one for the receiver information record, and the other for the sender information records) can occur in either order in the packet. Indeed they can be omitted as well once they have been transmitted a few times (to ensure that the server has cached them).

The receiver information record has two options for record format descriptors.

For receiverCallsign, receiverLocator, decoderSoftware use

00 03 00 24 99 92 00 03 00 00 
80 02 FF FF 00 00 76 8F 
80 04 FF FF 00 00 76 8F 
80 08 FF FF 00 00 76 8F 
00 00

For receiverCallsign, receiverLocator, decoderSoftware, antennaInformation use

00 03 00 2C 99 92 00 04 00 00 
80 02 FF FF 00 00 76 8F 
80 04 FF FF 00 00 76 8F 
80 08 FF FF 00 00 76 8F 
80 09 FF FF 00 00 76 8F 
00 00

The sender information record has four options (if your software does not fit nicely into any of them, then please contact me at the address below, and I will generate a record format descriptor specifically for you). All integer fields are 4 bytes unless noted. This means that the default templates only support frequencies up to ~4GHz. Changing the frequency definition to be 5 bytes gets a limit of 1THz. I.e. change the 04 to 05 in the line starting 80 05, and then send a 5 byte frequency.
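
As a sketch of the 5 byte variant, the frequency can be encoded with int.to_bytes. The 10 GHz value is a made-up example chosen only because it does not fit in four bytes.

```python
frequency_hz = 10_368_000_000  # hypothetical microwave frequency, above 2**32 Hz

assert frequency_hz >= 2**32               # would overflow a 4-byte field
encoded = frequency_hz.to_bytes(5, "big")  # five bytes, network order
print(len(encoded), int.from_bytes(encoded, "big"))  # 5 10368000000

# The frequency line of the descriptor with its length changed from 04 to 05.
frequency_field = bytes.fromhex("80 05 00 05 00 00 76 8F")
```

The descriptor itself stays the same size; only each sender information record grows by one byte.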

For senderCallsign, frequency, mode, informationSource (1 byte), flowStartSeconds use:

00 02 00 2C 99 93 00 05 
80 01 FF FF 00 00 76 8F  
80 05 00 04 00 00 76 8F 
80 0A FF FF 00 00 76 8F 
80 0B 00 01 00 00 76 8F 
00 96 00 04 

For senderCallsign, frequency, mode, informationSource (1 byte), senderLocator, flowStartSeconds use:

00 02 00 34 99 93 00 06 
80 01 FF FF 00 00 76 8F  
80 05 00 04 00 00 76 8F 
80 0A FF FF 00 00 76 8F 
80 0B 00 01 00 00 76 8F 
80 03 FF FF 00 00 76 8F  
00 96 00 04 

For senderCallsign, frequency, sNR (1 byte), iMD (1 byte), mode, informationSource (1 byte), flowStartSeconds use:

00 02 00 3C 99 93 00 07 
80 01 FF FF 00 00 76 8F  
80 05 00 04 00 00 76 8F 
80 06 00 01 00 00 76 8F 
80 07 00 01 00 00 76 8F 
80 0A FF FF 00 00 76 8F 
80 0B 00 01 00 00 76 8F 
00 96 00 04 

For senderCallsign, frequency, sNR (1 byte), iMD (1 byte), mode, informationSource (1 byte), senderLocator, flowStartSeconds use:

00 02 00 44 99 93 00 08 
80 01 FF FF 00 00 76 8F  
80 05 00 04 00 00 76 8F 
80 06 00 01 00 00 76 8F 
80 07 00 01 00 00 76 8F 
80 0A FF FF 00 00 76 8F 
80 0B 00 01 00 00 76 8F 
80 03 FF FF 00 00 76 8F  
00 96 00 04 

Note that the two values 99 92 and 99 93 are the linking values that tie the record format descriptors to the information records themselves. These values are arbitrary two byte values in the range 256 to 65535. If there is any possibility of running two different versions of your software on the same machine and they use different record format descriptors, then it is advisable to have distinct values for different record format descriptors.
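
For anyone who prefers to generate the boilerplate rather than paste it, here is a sketch that rebuilds the first sender descriptor listed above from a field list. The names build_sender_descriptor and SENDER_FIELDS are illustrative only.

```python
import struct

ENTERPRISE = 30351          # 00 00 76 8F
VARIABLE_LENGTH = 0xFFFF    # length code FF FF, used for string fields

# (attribute id, length) pairs for the first sender descriptor listed above.
SENDER_FIELDS = [
    (1, VARIABLE_LENGTH),   # senderCallsign
    (5, 4),                 # frequency
    (10, VARIABLE_LENGTH),  # mode
    (11, 1),                # informationSource
]

def build_sender_descriptor(link_id=0x9993, fields=SENDER_FIELDS):
    """Build a sender record format descriptor that ends with flowStartSeconds."""
    body = struct.pack("!HH", link_id, len(fields) + 1)
    for attribute_id, length in fields:
        # The 0x8000 bit marks an enterprise specific attribute.
        body += struct.pack("!HHI", 0x8000 | attribute_id, length, ENTERPRISE)
    body += struct.pack("!HH", 150, 4)  # flowStartSeconds: 00 96 00 04
    return struct.pack("!HH", 2, 4 + len(body)) + body

descriptor = build_sender_descriptor()
print(len(descriptor), descriptor[:8].hex(" ").upper())
# 44 00 02 00 2C 99 93 00 05
```

Passing a different link_id is how two versions of a program with different layouts would keep their descriptors distinct.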

Data

The first record in the data portion of the datagram is the receiver information record. This has the following header
99 92 ll ll 
'll ll' is the two byte length code that is the length of the receiver information record, including the header and padding.

The data that follows is encoded as three (or four — the number depends on the number of fields in the record format descriptor) fields of byte length code followed by UTF-8 (use ASCII if you don't know what UTF-8 is) data. The length code is the number of bytes of data and does not include the length code itself. Each field is limited to a length code of no more than 254 bytes. Finally, the record is null padded to a multiple of 4 bytes.

For example, to encode N1DQ, FN42hn, Homebrew v5.6, the datagram fragment would look like:

99 92 00 20 
04 4E 31 44 51 
06 46 4E 34 32 68 6E 
0D 48 6F 6D 65 62 72 65 77 20 76 35 2E 36 
00 00 
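
The same fragment can be produced programmatically. This is a minimal sketch; encode_string and build_receiver_record are hypothetical helper names.

```python
import struct

def encode_string(text):
    """One-byte length code followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    if len(data) > 254:
        raise ValueError("field longer than 254 bytes")
    return bytes([len(data)]) + data

def build_receiver_record(fields, link_id=0x9992):
    """Header, length-prefixed strings, then null padding to a multiple of 4."""
    payload = b"".join(encode_string(f) for f in fields)
    length = 4 + len(payload)
    padding = -length % 4
    return struct.pack("!HH", link_id, length + padding) + payload + bytes(padding)

record = build_receiver_record(["N1DQ", "FN42hn", "Homebrew v5.6"])
print(len(record), record[:4].hex(" ").upper())  # 32 99 92 00 20
```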

The decoderSoftware values are currently shown in the statistics page, so including version numbers (and any other pertinent information) may prove useful. The antennaInformation is shown on the map display when you hover over a receiver.

After this (finally!) come the sender information records. This block has the following header

99 93 ll ll
'll ll' is the two byte length code that is the length of all the sender information records, including the header and padding.

The data that follows is encoded as a sequence of records, each of which contains the fields from the record format descriptor that you chose above, in the order listed there. The callsign is encoded as a string using the byte length code format as described above, as are the mode and (if present) the senderLocator. The frequency is a four byte integer (in network order). The iMD and sNR are single bytes (i.e. -128 to +127) and are only present if you chose that record format descriptor. The informationSource is a single byte. The flowStartSeconds is a four byte integer (in network order) that records the time (the value of time(0)) that the callsign was recognized.

There is no padding between the records, but there is padding (with null bytes) to a multiple of four at the end.

For example, to encode (using the first record format descriptor) the following two reception records: N1DQ, 14070567, PSK, 1, 1200960084 (some time about 2008-01-22 00:00Z) and KB1MBX, 14070987, PSK, 1, 1200960104 (some time about 2008-01-22 00:00Z), the datagram fragment would look like:

99 93 00 2C 
 04 4E 31 44 51 
 00 D6 B3 27 
 03 50 53 4B
 01
 47 95 32 54 

 06 4B 42 31 4D 42 58 
 00 D6 B4 CB 
 03 50 53 4B
 01
 47 95 32 68

 00 00
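
A sketch of encoding the same two reception records in Python follows, using the field order of the first sender descriptor. The helper names are hypothetical.

```python
import struct

def encode_string(text):
    """One-byte length code followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    return bytes([len(data)]) + data

def encode_sender_record(callsign, frequency_hz, mode, source, seen_at):
    """senderCallsign, frequency, mode, informationSource, flowStartSeconds."""
    return (encode_string(callsign)
            + struct.pack("!I", frequency_hz)
            + encode_string(mode)
            + struct.pack("!BI", source, seen_at))  # "!" means no alignment gaps

def build_sender_block(records, link_id=0x9993):
    """Header, all records back to back, then null padding to a multiple of 4."""
    payload = b"".join(encode_sender_record(*r) for r in records)
    length = 4 + len(payload)
    padding = -length % 4
    return struct.pack("!HH", link_id, length + padding) + payload + bytes(padding)

block = build_sender_block([
    ("N1DQ", 14070567, "PSK", 1, 1200960084),
    ("KB1MBX", 14070987, "PSK", 1, 1200960104),
])
print(len(block), block[:4].hex(" ").upper())  # 44 99 93 00 2C
```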

The following is an example of a complete datagram (using the sample data above), sent at 1200960114, with sequence number 1. It consists of, in order, the header, the record format descriptor for the receiver information record, the record format descriptor for the sender information records, the block for the single receiver information record (with padding at the end), and, finally, the block containing two sender information records (with padding at the end).

00 0A 00 AC 47 95 32 72 00 00 00 01 00 00 00 00  

00 03 00 24  99 92 00 03 00 00  80 02 FF FF 00 00 76 8F  80 04 FF FF 00 00 76 8F  80 08 FF FF 00 00 76 8F  00 00  

00 02 00 2C  99 93 00 05  80 01 FF FF 00 00 76 8F  80 05 00 04 00 00 76 8F  80 0A FF FF 00 00 76 8F  80 0B 00 01 00 00 76 8F  00 96 00 04 

99 92 00 20  04 4E 31 44 51  06 46 4E 34 32 68 6E  0D 48 6F 6D 65 62 72 65 77 20 76 35 2E 36  00 00 

99 93 00 2C  04 4E 31 44 51  00 D6 B3 27  03 50 53 4B  01  47 95 32 54
             06 4B 42 31 4D 42 58  00 D6 B4 CB  03 50 53 4B  01  47 95 32 68
             00 00

If this same data was to be sent later in the session (say with sequence number 4) after the record format descriptors had already been transmitted, then the datagram would look like:

00 0A 00 5C 47 95 32 72 00 00 00 04 00 00 00 00  

99 92 00 20  04 4E 31 44 51  06 46 4E 34 32 68 6E  0D 48 6F 6D 65 62 72 65 77 20 76 35 2E 36  00 00 

99 93 00 2C  04 4E 31 44 51  00 D6 B3 27  03 50 53 4B  01  47 95 32 54
             06 4B 42 31 4D 42 58  00 D6 B4 CB  03 50 53 4B  01  47 95 32 68
             00 00
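
Putting the pieces together, the sketch below assembles both forms of the datagram and fills in the overall length. The constants are the sample blocks from this page written as hex strings (with 'PSK' as the ASCII bytes 50 53 4B and a field count of 5 in the sender descriptor); build_datagram is a hypothetical name.

```python
import struct

RECEIVER_DESCRIPTOR = bytes.fromhex(
    "00030024 9992 0003 0000"
    "8002FFFF0000768F 8004FFFF0000768F 8008FFFF0000768F 0000")
SENDER_DESCRIPTOR = bytes.fromhex(
    "0002002C 9993 0005"
    "8001FFFF0000768F 800500040000768F 800AFFFF0000768F 800B00010000768F"
    "00960004")
RECEIVER_RECORD = bytes.fromhex(
    "99920020 044E314451 06464E3432686E 0D486F6D65627265772076352E36 0000")
SENDER_BLOCK = bytes.fromhex(
    "9993002C"
    "044E314451 00D6B327 0350534B 01 47953254"
    "064B42314D4258 00D6B4CB 0350534B 01 47953268"
    "0000")

def build_datagram(sequence, session_id, sent_at, with_descriptors):
    """Concatenate the blocks and prepend a header carrying the total length."""
    parts = []
    if with_descriptors:
        parts += [RECEIVER_DESCRIPTOR, SENDER_DESCRIPTOR]
    parts += [RECEIVER_RECORD, SENDER_BLOCK]
    body = b"".join(parts)
    header = struct.pack("!HHIII", 0x000A, 16 + len(body),
                         sent_at, sequence, session_id)
    return header + body

first = build_datagram(1, 0, 1200960114, with_descriptors=True)
later = build_datagram(4, 0, 1200960114, with_descriptors=False)
print(len(first), len(later))  # 172 92
```

The two lengths correspond to the 00 AC and 00 5C length codes in the headers of the two examples.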

Field Notes

informationSource

This field really must be set to the value 1 ('automatically extracted') in order for the record to be considered. If the value is 2, then this record is considered to be the result of a QSO and the record is marked specially and the map is marked accordingly.
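
For reference, the bit layout of this field (bottom two bits for the source, 0x80 for a test transmission) can be unpacked as in the sketch below. The function name and the 'unspecified' label for a source code of 0 are assumptions.

```python
SOURCE_NAMES = {
    1: "automatically extracted",
    2: "from a call log (QSO)",
    3: "other manual entry",
}
TEST_FLAG = 0x80  # set when the record is a test transmission

def describe_information_source(value):
    """Split an informationSource byte into (source description, is_test)."""
    source = value & 0x03              # bottom two bits
    is_test = bool(value & TEST_FLAG)
    return SOURCE_NAMES.get(source, "unspecified"), is_test

print(describe_information_source(0x01))  # ('automatically extracted', False)
print(describe_information_source(0x81))  # ('automatically extracted', True)
```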

senderLocator

The senderLocator is not normally known except sometimes in the case of a QSO where it is transmitted as part of the QSO. In this case, it is highly desirable to include it in the record. In particular, this helps locate certain stations which are not found in other online databases. It also helps locate /MM stations.

Data Storage

The datagrams are received and if the record format descriptors are not present, then the source ip/port combination is used to see if there is saved record format descriptor information. With the format descriptors (from either source), the data can be extracted. Clock skew is detected — if the time of transmission is more than one minute from the time of reception, it is assumed that the sender's clock is set incorrectly. In this case, the clock skew can be calculated and all times in the packet can be adjusted accordingly.
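
The clock skew adjustment can be pictured roughly as follows. This is only an illustration of the idea described above, not the server's actual code, and the function name correct_times is made up.

```python
SKEW_THRESHOLD = 60  # seconds; "more than one minute" in the text above

def correct_times(sent_at, received_at, record_times):
    """Shift record timestamps when the sender's clock looks wrong.

    sent_at is the transmission time from the datagram header and
    received_at is the receiving machine's own clock.
    """
    skew = received_at - sent_at
    if abs(skew) <= SKEW_THRESHOLD:
        return list(record_times)      # clock is close enough; leave times alone
    return [t + skew for t in record_times]

# A sender whose clock is one hour slow:
print(correct_times(1000, 4600, [990, 995]))  # [4590, 4595]
```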

The sender information records are combined with the receiver information record and inserted into a database to form a table of reception records.

Data Retrieval

Reception records can be retrieved from the database by performing an http GET/POST on the URL https://retrieve.pskreporter.info/query?senderCallsign=requestedcall

This will return the last 100 reception records for the requested callsign, but for no more than 6 hours in any event. The default set of fields will be returned (receiverCallsign, receiverLocator, senderCallsign, frequency, flowStartSeconds). [Comments invited on query interface.]

The full list of query parameters is:
Parameter | Explanation
senderCallsign | Specifies the sending callsign of interest.
receiverCallsign | Specifies the receiving callsign of interest.
callsign | Specifies the callsign of interest. Specify only one of these three parameters.
flowStartSeconds | A negative number of seconds to indicate how much data to retrieve. This cannot be more than 24 hours.
mode | The mode of the signal that was detected.
rptlimit | Limit the number of records returned.
rronly | Only return the reception report records if non zero.
noactive | Do not return the active monitors if non zero.
frange | A lower and upper limit of frequency. E.g. 14000000-14100000
nolocator | If non zero, then include reception reports that do not include a locator.
callback | Causes the returned document to be javascript, and it will invoke the function named in the callback.
statistics | Includes some statistical information.
modify | If this has the value 'grid' then the callsigns are interpreted as grid squares.
lastseqno | Limits search to records with a sequence number greater than or equal to this parameter. The last sequence number in the database is returned on each response.

The format of the returned information will be an XML document [is this a useful format?]. A sample XML document is:

<receptionReports>
  <receptionReport receiverCallsign="KB1MBX" receiverLocator="FN42hn" senderCallsign="N1DQ" frequency="14070987" flowStartSeconds="xxxxxxx" />
  <receptionReport receiverCallsign="ZZ1ZZ" receiverLocator="GG99" senderCallsign="N1DQ" frequency="14070987" flowStartSeconds="xxxxxxx" />
</receptionReports>

An example can be seen for N1DQ.

The actual properties on each receptionReport have obvious (!) names.
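
A short sketch of reading such a document with Python's standard XML parser follows. The sample text is hypothetical: attribute values are quoted so that it parses, and the flowStartSeconds value is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical response text, shaped like the sample above.
SAMPLE = """<receptionReports>
  <receptionReport receiverCallsign="KB1MBX" receiverLocator="FN42hn"
                   senderCallsign="N1DQ" frequency="14070987"
                   flowStartSeconds="1200960104" />
</receptionReports>"""

def parse_reports(xml_text):
    """Return one dict per receptionReport element, with numeric fields as ints."""
    reports = []
    for element in ET.fromstring(xml_text).iter("receptionReport"):
        report = dict(element.attrib)
        for name in ("frequency", "flowStartSeconds"):
            if name in report:
                report[name] = int(report[name])
        reports.append(report)
    return reports

for report in parse_reports(SAMPLE):
    print(report["receiverCallsign"], report["receiverLocator"], report["frequency"])
# KB1MBX FN42hn 14070987
```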

Users are encouraged to retrieve reception data no more often than once every five minutes. If the display of reception data is integrated into the PSK transmitting application, then the timing can be optimized — do a retrieval five minutes after each transmission of 'de callsign callsign' (provided that it is more than five minutes since the previous retrieval). The purpose of the five minute delay is to allow all the receivers to make their reports.

The /query url will point to the latest version of the query API. There is an element in the response that indicates the query endpoint being used. If you care about version changes, then you may want to pick a particular version.

Please note that I may require (in the future) that frequent users of the API enable compression on their connections, that they not request the same data frequently etc. I reserve the right to block or rate limit anybody who imposes a significant load on my system. If you want to be notified, then please add an additional query parameter of 'appcontact=myemailaddress' so that I can contact you. This will also enable me to work with you to solve any problems.
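
As an illustration, a query URL carrying the appcontact parameter might be built like this. The function name, the one hour default and the example email address are assumptions; the parameter names come from the table above.

```python
from urllib.parse import urlencode

QUERY_URL = "https://retrieve.pskreporter.info/query"

def build_query_url(sender_callsign, seconds_back=3600, contact=None):
    """Build a retrieval URL; flowStartSeconds is sent as a negative number."""
    params = {
        "senderCallsign": sender_callsign,
        "flowStartSeconds": -abs(seconds_back),
        "rronly": 1,  # reception reports only
    }
    if contact:
        params["appcontact"] = contact  # lets the operator reach you
    return QUERY_URL + "?" + urlencode(params)

print(build_query_url("N1DQ", contact="me@example.com"))
```

An application would fetch this URL no more often than once every five minutes, as requested above.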

Hack to update your locator

It turns out that you can use this mechanism to update your locator. This could be useful if you are mobile and want to update your locator as you move. To do this, you just need to submit a record that contains your callsign and locator as a receiver, and do not include any transmitter records.

Using examples from above, this packet (once you fill in the various ll, tt, ss, ii fields correctly as above) would set N1DQ's locator to 'FN42hn'.

00 0A ll ll tt tt tt tt ss ss ss ss ii ii ii ii

00 03 00 24 99 9F 00 03 00 00 
80 02 FF FF 00 00 76 8F 
80 04 FF FF 00 00 76 8F 
80 08 FF FF 00 00 76 8F 
00 00

99 9F ll ll

04 4E 31 44 51                              'N1DQ'
06 46 4E 34 32 68 6E                        'FN42hn'
0D 48 6F 6D 65 62 72 65 77 20 76 35 2E 36   'Homebrew v5.6'
00 00
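
A sketch that fills in the ll, tt, ss and ii fields of this packet is shown below. The function name build_locator_update and the identifier value are made up for the example.

```python
import struct
import time

# Receiver descriptor from above, using the linking value 99 9F.
DESCRIPTOR = bytes.fromhex(
    "00030024 999F 0003 0000"
    "8002FFFF0000768F 8004FFFF0000768F 8008FFFF0000768F 0000")

def encode_string(text):
    """One-byte length code followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    return bytes([len(data)]) + data

def build_locator_update(callsign, locator, software, sequence, session_id, now=None):
    """Header + receiver descriptor + one receiver record, with no sender records."""
    now = int(time.time()) if now is None else int(now)
    payload = b"".join(encode_string(s) for s in (callsign, locator, software))
    length = 4 + len(payload)
    padding = -length % 4
    record = struct.pack("!HH", 0x999F, length + padding) + payload + bytes(padding)
    body = DESCRIPTOR + record
    header = struct.pack("!HHIII", 0x000A, 16 + len(body), now, sequence, session_id)
    return header + body

packet = build_locator_update("N1DQ", "FN42hn", "Homebrew v5.6",
                              sequence=1, session_id=0x12345678, now=1200960114)
print(len(packet))  # 84
```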

Data Display

The data can be displayed in any way that the end software developer desires. As part of this project, there will be a simple google map based application that displays the reception data for a given callsign in a browser window. There is a map display.

Miscellaneous

Why five minutes?

I am concerned about the data rate if this takes off. There could be (say) 10,000 clients submitting data continuously, and at a 300 second interval, this comes to roughly 33 packets per second. This will turn into a significant database load if there is any significant number of reports per datagram.

Hopefully most of the clients will implement an adaptive use of the retrieval mechanism otherwise that will prove a huge load. You may think that 10,000 is a lot of clients, but that is not clear to me. This application gives a purpose to people to just leave their rigs turned on and tuned to one of the PSK frequencies and logging away. In fact, I'd like one of the PSK clients to be able to control the frequency by time of day!

In fact there is now such a tool -- freqmode2hrd that interrogates a web page to get the best frequency for your grid square.

Why use IPFIX?

IPFIX is a protocol designed for logging data associated with network traffic. The packets are self describing, and this allows for future upgrades in the PSK reporting system without huge difficulties. Also, I prefer not to reinvent the wheel!

Notes

Callsigns are case insensitive.

The callsign to be reported is the entire string that is repeated after the 'DE'; this includes any prefixes and suffixes. It is important that all authors be consistent as it allows the transmitter of the CQ to query the database for their particular string.

Testing

For testing purposes, there is a listener on pskreporter.info on port 14739 which will analyze the received packet. The results of the analysis can be seen at packet analysis. The last few received packets are displayed.
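
A minimal sketch of sending a datagram to the test listener follows. The host and port are the ones named above; the function names are hypothetical. The socket is opened once and reused so that every datagram leaves from the same UDP source port.

```python
import socket

TEST_HOST = "pskreporter.info"  # test listener named in the text above
TEST_PORT = 14739

def open_socket():
    """Create one UDP socket for the session; the OS picks the source port once."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", 0))
    return sock

def send_datagram(sock, datagram, host=TEST_HOST, port=TEST_PORT):
    """Send one assembled datagram and return the number of bytes sent."""
    return sock.sendto(datagram, (host, port))
```

A client would call open_socket once at startup, then pass each assembled datagram to send_datagram and check the packet analysis page to see how it was decoded.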
Philip Gladstone