Standard Service Offering

Hi everyone,

Overall, I wonder what compliance with this SSO would look like. Is there room to comply with SMS and payments, but not with Voice and USSD? Is it about a provider reaching a level overall and communicating that to potential partners? For NGOs I’d probably go by use-cases: a provider certifies that their API supports a set of use-cases as defined by the SSO. That may make it easier for NGOs to decide on a partner. “High-level aggregators” like us (engageSPARK) would probably be more interested in just seeing which Core SMS services are being complied with, etc.

Re payments: Agreed, I was also not sure why it’s missing, especially since airtime topup and reverse billing were mentioned. I’d probably move “airtime distribution” to the payments column for a start. And if it’s part of “payments” then it would be basic, right? (Is it core in terms of that category, or core in terms of NGOs’ capabilities to run programs?)

USSD: I’m curious to hear from others how relevant that is going forward for the programs that NGOs launch.

Admin/Utility: I understand that “Message cost calculator” is something important to have, but does it need to be an API? If someone wants to launch a program, isn’t that a one-time UI tool? Similarly with “Message page count calculator”—why as an API?

Speech-to-text: What would compliance mean here? I’m wondering about the language diversity, and the capability to recognize anything other than the most common languages in NLP toolkits (English, Spanish, Chinese, …). Looks like a can of worms. Omit?

Hosted flow/tree logic: From the subtext “(single asynchronous delivery on session close/timeout)” I read this as receiving after-the-fact notifications about what happened in the SMS/Voice/USSD campaign, maybe via webhook. Is this correct? If yes, where would one find behavior that allows disbursing airtime in response to completion of a survey? Is that logic part of “Reverse-billing”?

“Specify # rings before pickup”: More minimal would be config to simply not pick up.

“automatic charset handling” is part of Core’s “String-based Two-way SMS”, but it’s also sort of a Recommended feature in “Automatic character substitution”? Maybe I’m misunderstanding?

Unfortunately, in countries where USSD is available as a service, I think it will remain relevant. It’s a terrible protocol, susceptible to lots of timeouts but universally accessible. For quick data capturing we’ve found it to work better, because the feedback loop is instant, and cheaper, because capturing the same reliably via SMS often requires multiple (costly) interactions. SMS and USSD will likely remain important to provide services to those who need it the most. Unfortunately I predict both will become increasingly expensive over time as their market share diminishes.

Couldn’t agree more. From stakeholder feedback so far, inclusion of mobile money services would be appealing to a number of larger NGOs, and has the added benefit of being a core focus area for telcos.

Typical scenarios for mobile money are Business to Customer, Customer to Business, Customer to Customer, and Business to Business.

It feels like Business to Customer is the most popular application, e.g. humanitarian cash disbursements, conditional cash transfers for mothers, incentive payments for farmers / community health workers. However, I’m sure we can think of more use cases in the other three.

Open to suggestions @Luke_Kyohere as to how we break this down to the next level of detail!

Overall, I wonder what compliance with this SSO would look like. Is there room to comply with SMS and payments, but not with Voice and USSD?

You make a valid point. It may be hard to define core by channel, given there are plenty of SMS-only services (your standard bulk messaging campaigns) out there, and other voice-only services (a medical hotline) etc. Perhaps the SSO is about compliance with a certain minimal capability within a channel (e.g. in SMS, the ability to do delivery reporting, among other things). In short, I would not see a problem with having an SMS-only compliant aggregator. In some cases you would prefer to work with a specialist player that focuses on one channel (and perhaps has direct MNO connections) rather than a generalist aggregator that can deliver multiple channels but perhaps executes through another 3rd party aggregator.

However, apart from channels, perhaps there are other services which are truly common across all M4D services (e.g. short / long code provisioning, reverse billing).

@jwishnie for SMS SSO, I’m curious what motivated synchronous delivery reporting as a core offering? The only reliable delivery reporting mechanism I’ve seen is asynchronous, since the underlying protocols themselves are asynchronous.

I would vote to move async delivery reporting to core instead of recommended, dropping sync delivery reporting entirely.
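
To make the async case concrete, here is a minimal sketch of how a client might track delivery reports that arrive some time after submission. The `DeliveryTracker` class, its method names, and the status strings are hypothetical illustrations, not part of the SSO or of any provider's API.

```python
from dataclasses import dataclass, field


@dataclass
class DeliveryTracker:
    """Tracks messages whose delivery reports arrive after submission."""

    # message_id -> latest known status (hypothetical status strings)
    statuses: dict = field(default_factory=dict)

    def on_submit(self, message_id: str) -> None:
        # Submission returns right away; nothing is known about delivery yet.
        self.statuses[message_id] = "submitted"

    def on_delivery_report(self, message_id: str, status: str) -> bool:
        # Called later (for example from a webhook) when a report is relayed.
        # Returns False for reports that match no known message.
        if message_id not in self.statuses:
            return False
        self.statuses[message_id] = status
        return True

    def pending(self) -> list:
        # Messages that are still waiting for a delivery report.
        return sorted(m for m, s in self.statuses.items() if s == "submitted")
```

The point of the sketch is that the submit call and the report are two separate events joined only by a message ID, which is how the underlying protocols behave anyway.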

In what scenario is low-level UDH needed? Since automatic character substitution & handling, message splitting, and concatenation are already handled, I don’t see (nor have I had) a need for it.

Often our biggest problem has been verifying that a gateway is actually up, running and responsive. From a technical perspective, I would suggest this needs to be core. Working with aggregators + MNOs often means having to guess where a problem is because the health of systems is not transparent. The SSO should be able to give insight or at least some kind of signalling around backlogs and upstream throughput. A message delivery delay at the MNO is often indistinguishable from queuing problems at the aggregator level leading to lots of confusion, blame shifting, and time wasted as a result.

This was @dmccann’s proposal. I think the reasoning is that it’s generally easier to implement a synchronous API, though async delivery reporting does better match the underlying protocols.

I’m happy to drop Sync and move Async into Core.

@dmccann what do you think?

I think this would be quite an edge case where, for example, you want to hand-build a message in a unique encoding (‘artisanal’ sms). I’d be happy to drop this unless there is an outcry.

@smn Can you propose a set of monitoring/health/heartbeat functions you’d like to see in core?

This makes sense to me, though I see some utility in having a multi-channel “base” or standard of some sort, specifically in making it easy to explain and market to less technical customers.

Maybe we define ‘core, recommended, advanced’ per category (SMS, Voice, USSD, Payments etc…) and also consider defining standard ‘bundles’ e.g. “communications service bundle” ?

@muratk the motivation for an API is to enable user-level applications to provide cost/volume estimates to the end-user.

These apps can of course include their own encoding & splitting logic, but that creates the possibility that the API provider and the app count messages differently, creating errors.

It’s a bit like the “price-the-cart” problem where it’s best to have a single code path.

For example, I’ve seen some SMS systems use a simple logic for charset encoding and CSM calculation as follows:

  • try to encode message buffer in GSM 03.38
  • If succeeds, divide number of characters by 160 to determine number of SMSs
  • If fails, divide number of chars by 70 to determine number of SMSs

But a more sophisticated system would encode each individual message in a CSM as efficiently as possible, so that if a non-GSM char appeared once, say in the first 70 chars, the first message would be Unicode encoded, but the rest of the buffer would go as GSM.

If the underlying service provider uses one approach and the app another, counts will be off.
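
As a rough illustration of how the two approaches can disagree, here is a minimal Python sketch. The function names are made up for this example. It uses the simplified 160/70 limits from the list above and ignores both the concatenation header (which reduces per-part capacity in real multipart messages) and the fact that GSM extension characters take two septets, so treat the numbers as illustrative only.

```python
import math

# GSM 03.38 default alphabet plus extension characters. For simplicity each
# character is counted once, although extension characters need two septets.
GSM_CHARS = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
    "^{}\\[~]|€\f"
)


def naive_count(text: str) -> int:
    """Whole-buffer logic: 160 chars per SMS if all GSM, otherwise 70."""
    per_sms = 160 if all(ch in GSM_CHARS for ch in text) else 70
    return max(1, math.ceil(len(text) / per_sms))


def per_part_count(text: str) -> int:
    """Per-part logic: each part is GSM (160 chars) or Unicode (70 chars),
    chosen so that the total number of parts is as small as possible."""
    n = len(text)
    best = [0] * (n + 1)     # best[i]: fewest parts needed for text[i:]
    gsm_run = [0] * (n + 1)  # gsm_run[i]: consecutive GSM chars from i
    for i in range(n - 1, -1, -1):
        gsm_run[i] = gsm_run[i + 1] + 1 if text[i] in GSM_CHARS else 0
        as_unicode = 1 + best[min(n, i + 70)]
        if gsm_run[i] > 0:
            as_gsm = 1 + best[i + min(160, gsm_run[i])]
            best[i] = min(as_gsm, as_unicode)
        else:
            best[i] = as_unicode
    return max(1, best[0])
```

For a message made of one non-GSM character followed by 200 plain letters, `naive_count` gives 3 while `per_part_count` gives 2, which is exactly the kind of mismatch a shared calculator API would avoid.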

@muratk I think you caught a typo here. I believe we meant text-to-speech, not speech-to-text. @dmccann can you confirm?

Of course your question on languages still applies.

My gut would be to leave the specific languages out of any proposed standard, and maybe simply say something like “Text-to-Speech for 1+ languages”

Of course any provider, when describing their own offering, would want to list the languages they support.

@muratk: if the most basic APIs for handling SMS/USSD/Voice conversations are some kind of callback system (like webhooks), this would be the next level of hosted functionality, where a provider can run an entire interaction without needing to call webhooks, perhaps by implementing the RapidPro Flow specification.

The idea being that clients of the API could provide in advance any necessary assets (e.g. audio files for IVR) and rules such that the provider can run the interaction and provide the results to the client.
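
A rough sketch of what this could look like from the client's side, assuming the provider sends one result payload when the session closes or times out. Every field name here (`flow_id`, `steps`, `answers`, `status`, and so on) is hypothetical; this is not the RapidPro Flow format or any provider's actual payload.

```python
# Hypothetical flow definition that a client uploads ahead of time.
SURVEY_FLOW = {
    "flow_id": "survey-1",
    "channel": "ussd",
    "steps": [
        {"id": "q1", "prompt": "How many people live in your household?"},
        {"id": "q2", "prompt": "Do you have clean water? 1) Yes 2) No"},
    ],
}


def handle_session_result(flow: dict, result: dict) -> dict:
    """Process the single asynchronous result sent on session close/timeout."""
    expected = [step["id"] for step in flow["steps"]]
    answers = result.get("answers", {})
    missing = [step_id for step_id in expected if step_id not in answers]
    completed = result.get("status") == "closed" and not missing
    return {
        "contact": result.get("contact"),
        "completed": completed,
        "missing_steps": missing,
        # The client decides on any follow-up, such as an airtime top-up.
        "follow_up": "send_airtime" if completed else None,
    }
```

In this sketch the provider only runs the interaction and reports back; decisions such as rewarding a completed survey stay with the client.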

Honestly, this might never make sense for an API provider to offer. It may be a feature that makes most sense for an app provider like engageSpark to handle at the user level.

Hence ‘advanced’

I would worry that putting this in core would have strange unintended consequences. I’ve seen aggregators implement fake (and immediate) MT delivery reports for MNOs that didn’t supply delivery reports because that’s what a contract stipulated.

I’m struggling to define a set of criteria that’s comprehensive enough to be useful as what is possible depends on the underlying protocols used. Here’s a very incomplete attempt:

  1. For protocols that implement an underlying connectivity signal, expose an API that provides a view of the last n signals received and maps those to standardised indicators.

Example: SMPP’s enquire_link PDU is sent at regular intervals. An API should be able to map this to a standardised indicator that represents an established connection.

  2. For protocols that implement a stateful connection, expose an API that maps the state of the connection to a standardised indicator.

Example: An SMPP connection being bound in rx, tx, or trx mode can be mapped to indicators showing that authentication has succeeded. Absence of a bind means a disconnection.

  3. For systems that rely on internal queuing, expose an API that gives insight into queue sizes and growth / drain rates. Junebug provides some support for this.

  4. For stateless connections (which most USSD protocols are), provide an API that reports the success rate of the n most recent USSD replies (if the protocol is asynchronous) and the n most recent USSD requests received.
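
To show roughly what such indicators might look like, here is a small Python sketch. The `Indicator` values, the function names, and the thresholds (2x and 5x the keep-alive interval) are all assumptions chosen for illustration; none of them come from SMPP, Junebug, or the SSO.

```python
import time
from enum import Enum


class Indicator(Enum):
    UP = "up"
    DEGRADED = "degraded"
    DOWN = "down"


def link_indicator(bound, last_signals, interval, now=None):
    """Map bind state plus recent keep-alive timestamps to one indicator.

    bound: whether the connection is bound (e.g. SMPP rx, tx or trx).
    last_signals: timestamps in seconds of the last n keep-alives seen.
    interval: expected number of seconds between keep-alives.
    """
    now = time.time() if now is None else now
    if not bound or not last_signals:
        return Indicator.DOWN
    age = now - max(last_signals)
    if age <= 2 * interval:   # arbitrary threshold for this sketch
        return Indicator.UP
    if age <= 5 * interval:   # arbitrary threshold for this sketch
        return Indicator.DEGRADED
    return Indicator.DOWN


def queue_growth_rate(samples):
    """samples: list of (timestamp, queue_size), oldest first.

    Returns messages per second: positive means the backlog is growing,
    negative means it is draining.
    """
    if len(samples) < 2:
        return 0.0
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (s1 - s0) / (t1 - t0)


def success_rate(outcomes):
    """outcomes: booleans for the n most recent USSD replies."""
    if not outcomes:
        return None  # no data is different from a 0% success rate
    return sum(outcomes) / len(outcomes)
```

What is actually observable will still depend on the underlying protocol, so a provider would presumably expose only the subset that applies to each connection.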

I think my “app developer” bias was probably showing here. I don’t have a strong preference as to which (sync or async) lives where. My concern would more be that at least some form of delivery reporting is available in core.

My reasoning here was to assume that the application developer:

  1. Wants to be able to collect user input via speech or DTMF
  2. Lacks the sophistication to have either voice synthesis or recognition built into their app (might be dubious, that’s why this is only in recommended rather than core).

Since there’s a low-tech workaround for text-to-speech (pre-recorded humans), I felt like only speech-to-text needed to be here.

What do people think about the above, especially #2?

I think text-to-speech is a basic feature of all the headless-IVR systems. It’s easy to implement and useful. I wouldn’t leave it out.

Speech-to-text is pretty advanced stuff, and does have a workaround–record the speech and send the file to a transcription (machine or human) service :wink:

If we keep speech-to-text, I’d argue for moving it to ‘advanced’

I’ll adjust based on @smn’s suggestion

Provided the response isn’t being used in your branching logic, of course! :stuck_out_tongue: Fine with moving to advanced.

Do you think it’s possible/reasonable to have a higher-level ‘monitoring’ API that provides a simplified indicator, maybe per-service or channel, of whether the service provider considers it ‘operational’ or not?

I’m thinking something like Twilio’s dashboard (and the underlying API that feeds it):
https://status.twilio.com/
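
One way to picture such a simplified indicator is as a roll-up of whatever detailed checks the provider already has. The sketch below is only an assumption about how that could work: the check names and the status words are invented, and it does not reflect the format of Twilio's status API.

```python
def channel_status(checks: dict) -> dict:
    """Collapse detailed checks into one status word per channel.

    checks: {"sms": {"connection": "up", "queue": "degraded"}, ...}
    Each value is one of "up", "degraded" or "down".
    """
    severity = {"up": 0, "degraded": 1, "down": 2}
    label = {0: "operational", 1: "degraded", 2: "outage"}
    summary = {}
    for channel, indicators in checks.items():
        if not indicators:
            # No checks reported: say so rather than guess.
            summary[channel] = "unknown"
            continue
        # The worst individual check decides the channel's overall status.
        worst = max(severity[value] for value in indicators.values())
        summary[channel] = label[worst]
    return summary
```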