The Percentile Schedule Equation: Shaping Without Guessing

Use K = (M + 1)(1 - W) to set shaping criteria, pick a reinforcement density, and choose your look-back window for any acquisition target. Drawn from a BCBA-led CEU.

Key takeaway

The percentile schedule formula K = (M + 1)(1 - W) turns shaping from a gut call into a clean piece of arithmetic, and the worked example below shows why.

Watch the full CEU recording

The Math Behind Behavior Reduction

Matt Harrington · 1 CEU · 60 min

The percentile schedule formula K = (M + 1)(1 - W) turns shaping from a gut call into a clean piece of arithmetic, and the worked example below shows why. If you set M to 5 and W to 0.7, you plug in (5 + 1)(1 - 0.7), which is 6 × 0.3, which is 1.8. K is a rank position inside the look-back window, so you round 1.8 to the nearest whole rank, and K is 2. That means the learner's next response only has to beat the second-lowest score in the last five trials to earn reinforcement. No spreadsheet during the session. No "does this feel like progress?" coin flip. Just a number.

This page is for the Board Certified Behavior Analyst (BCBA) who already runs shaping every day and wants to stop guessing where the bar should sit. The percentile schedule, which is a way of letting recent performance set today's reinforcement criterion, was the second half of a one-hour CEU on the math behind behavior reduction. The first half covered contingency strength. This page only covers the shaping equation. Everything below is built on the worked example from the talk.

What the percentile schedule actually does

The percentile schedule answers one question for you, on every trial. Should I reinforce this response, or should I let it pass?

You already do this. You sit next to an RBT, you watch a kid sit at the table for 14 seconds, you remember the last three trials were 10, 12, and 15 seconds, and you make a call. The percentile schedule is the same call, written down. It takes the recent observations, asks how often you want reinforcement to land, and spits out a single number the next response has to beat.

That number is the score sitting at rank K in your recent window. The two things you tune are M and W. M sets how far back you look. W sets how often you want the learner to win. Pick those two, and the math picks K for you.

K equals M plus one, one minus W. K is the response value that the next behavior must exceed. From the talk — Matt Harrington

The three variables: K, M, W

K is the output. K is a rank position inside your recent window, sorted from smallest to largest. The score sitting at rank K is the value the next behavior has to exceed. If K rounds to 2, the next response has to beat the second-lowest score in the look-back window. If K rounds to 4, it has to beat the fourth-lowest. K is not seconds. K is not trials. K only tells you which of the recent scores is the bar.

M is one of the two inputs. M is the size of the look-back window. Three sessions, five sessions, ten trials, whatever you decide is enough recent behavior to base the next call on.

W is the other input. W is the density of reinforcement, expressed as a decimal between 0 and 1. W of 0.7 means you want the learner reinforced on roughly 70% of trials at this step. W of 0.3 means roughly 30%. W is the single most powerful dial on the page, because it controls how often the learner gets to win.
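The relationship between the three variables can be sketched in a few lines of Python. This is a minimal sketch, not the talk's own tooling: the function name `percentile_rank` is hypothetical, the formula is read as (M + 1)(1 - W), which matches the spoken "M plus one, one minus W" and the K = 2 result in the worked example, and rounding to the nearest whole rank (halves up) with a clamp to the window is an assumption.

```python
import math


def percentile_rank(m: int, w: float) -> int:
    """Return K, the rank (1 = lowest) inside the sorted look-back window."""
    if m < 1:
        raise ValueError("m must be at least 1")
    if not 0 < w < 1:
        raise ValueError("w must be strictly between 0 and 1")
    raw_k = (m + 1) * (1 - w)
    # Assumption: round to the nearest whole rank (halves round up), then
    # clamp so that rank K always exists inside a window of m scores.
    k = math.floor(raw_k + 0.5)
    return min(max(k, 1), m)


print(percentile_rank(5, 0.7))  # (5 + 1) * (1 - 0.7) = 1.8, which rounds to 2
```

The clamp only matters at the extremes, for example a very high W with a short window, where the raw value would otherwise round to rank 0.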

Picking M: how many sessions to look back

M is a question about memory. How much recent behavior do you trust as a signal about where this learner is right now?

M is the number of recent observations. Basically, how far back do I look? In severe behavior, for example, you're looking, sometimes you're looking for three sessions or five sessions, to see what the behavior was, to see if we're reinforcing the next behavior. From the talk — Matt Harrington

Three is a fine starting point for behavior reduction work where sessions are short and you need the bar to track recent reality. Five is the most common choice for skill acquisition because it smooths out one bad trial without dragging in data from two weeks ago. Ten is the right call when your sessions are long, the behavior is stable, and you do not want a single high outlier to drive the criterion up.

If you raise M, the bar moves more slowly, which is good when the learner is having a rough week and you do not want the criterion to crash. If you lower M, the bar moves more quickly, which is good when the learner is climbing fast and you want the criterion to follow.
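To see how the window size changes the bar, here is a small sketch that replays one series of scores under two values of M. The durations are made up for illustration, the helper names are hypothetical, and K is taken as (M + 1)(1 - W) rounded to the nearest rank, which is an assumption about the rounding rule.

```python
import math


def percentile_rank(m, w):
    # Assumption: K = (M + 1)(1 - W), rounded to the nearest rank and
    # clamped to the window.
    k = math.floor((m + 1) * (1 - w) + 0.5)
    return min(max(k, 1), m)


def criteria_over_time(scores, m, w):
    """Bar in effect after each trial, starting once M scores exist."""
    k = percentile_rank(m, w)
    bars = []
    for end in range(m, len(scores) + 1):
        window = sorted(scores[end - m:end])  # the last M scores, ranked
        bars.append(window[k - 1])            # score sitting at rank K
    return bars


# Hypothetical sitting durations in seconds, with one bad trial (9).
durations = [10, 12, 15, 14, 9, 16, 18, 17, 20, 19]

print(criteria_over_time(durations, m=3, w=0.7))  # [10, 12, 9, 9, 9, 16, 17, 17]
print(criteria_over_time(durations, m=5, w=0.7))  # [10, 12, 14, 14, 16, 17]
```

In this made-up series the 9-second trial pulls the M = 3 bar down to 9 for three trials, while the M = 5 bar never drops to 9. Part of that difference comes from the rank itself: at W = 0.7, K is 1 for a window of three and 2 for a window of five.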

Picking W: high density vs. low density

W is the density of reinforcement. Higher W, more reinforcement. Lower W, less.

A W of 0.1 means that there's a 10% chance that a response is going to be reinforced. A W of 0.9 means there's a 90% chance something's going to be reinforced. From the talk — Matt Harrington

High W lives in the same neighborhood as errorless learning. The learner gets reinforced often. Extinction is rare. Acquisition is slower but the response class stays calm and there is very little extinction-induced variability.

Low W lives in the same neighborhood as error correction. The learner gets reinforced less often. Extinction shows up more, which speeds up acquisition because extinction induces variability and the learner tries new things. The trade-off is that extinction is not free. If the response class includes head banging on concrete, you do not want to pay that cost.

A safe default for a new skill acquisition target is W between 0.6 and 0.8. Start at 0.7 and tune from there.
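The sketch below tabulates what different W values do to K for a five-trial window. It assumes K = (M + 1)(1 - W) rounded to the nearest rank. The "realized density" column uses 1 - K / (M + 1), the chance that a new response beats the rank-K score when recent responses are interchangeable and there are no ties; that is an idealization, so treat it as a rough guide rather than a promise about any one learner.

```python
import math


def percentile_rank(m, w):
    # Assumption: nearest-rank rounding, clamped to the window.
    k = math.floor((m + 1) * (1 - w) + 0.5)
    return min(max(k, 1), m)


def realized_density(m, k):
    # Idealized chance of beating the rank-k score out of m (no ties).
    return 1 - k / (m + 1)


M = 5
table = []
for w in (0.1, 0.3, 0.5, 0.7, 0.9):
    k = percentile_rank(M, w)
    table.append((w, k, round(realized_density(M, k), 2)))

for w, k, actual in table:
    print(f"W={w:.1f}  K={k}  realized density ~ {actual:.2f}")
```

Because K has to be a whole rank, the density you actually get is the nearest one the window can express. With M = 5, asking for 0.7 lands on K = 2, which works out to roughly two responses in three.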

A 5-trial worked example with K = 2

Here is the full example from the talk. M is 5. W is 0.7. You watch the learner do five baseline trials and you record the step number they completed on each one. Then you plug in the numbers.

Let's say for this example, we want 70%. So we do a 0.7. So that means that if we round up the criteria, the response has to exceed is two. From the talk — Matt Harrington

Plug it in. K = (M + 1)(1 - W). K = (5 + 1)(1 - 0.7). K = 6 × 0.3. K = 1.8.

Now you sort the last five trials from smallest to largest. Say they completed steps 1, 1, 1, 2, and 1. Sorted, that is 1, 1, 1, 1, 2. K of 1.8 rounds to 2, and you read the bar at rank position 2 from the bottom. That value is 1. So the next response has to beat 1. A response at step 2 earns reinforcement. A response at step 1 does not.

Run the next trial. Drop the oldest score, add the newest, and re-rank. The bar moves with the learner. You never set a new criterion by hand again. You set M once, you set W once, and the percentile schedule updates the bar every trial.
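The drop-oldest, add-newest, re-rank loop can be sketched as a small class. This is an illustration under assumptions, not the tool used in the talk: the class and method names are hypothetical, K is computed as (M + 1)(1 - W) rounded to the nearest rank, and a response earns reinforcement only when it is strictly greater than the score at rank K.

```python
import math
from collections import deque


class PercentileSchedule:
    """Rolling percentile criterion over the last M responses."""

    def __init__(self, m, w, baseline):
        if len(baseline) != m:
            raise ValueError("baseline must contain exactly m scores")
        # Assumption: nearest-rank rounding, clamped to the window.
        k = math.floor((m + 1) * (1 - w) + 0.5)
        self.k = min(max(k, 1), m)
        # maxlen makes the deque drop the oldest score on every append.
        self.window = deque(baseline, maxlen=m)

    def criterion(self):
        """Score at rank K of the sorted window: the bar to beat."""
        return sorted(self.window)[self.k - 1]

    def record(self, response):
        """Decide on reinforcement, then slide the window forward."""
        reinforce = response > self.criterion()
        self.window.append(response)
        return reinforce


schedule = PercentileSchedule(m=5, w=0.7, baseline=[1, 1, 1, 2, 1])
print(schedule.k, schedule.criterion())  # 2 1
print(schedule.record(1))                # False: step 1 does not beat 1
print(schedule.record(2))                # True: step 2 beats 1
```

Every response enters the window whether or not it was reinforced, so the bar reflects what the learner actually did on the last M trials.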

When to lean errorless (high W) vs. fast acquisition (low W)

The W choice is a risk choice.

Lean high W, 0.7 to 0.9, when the response class is dangerous, when the learner has a thin history of reinforcement and is fragile, when the staff running the program are new and a low criterion is easier to implement with fidelity, or when you are early in a new program and you want the learner to feel the contingency right away.

Lean low W, 0.2 to 0.4, when the response class is safe, when the learner has a long history with the skill and is plateauing, when you want extinction-induced variability to break a stuck pattern, or when you want acquisition to move fast and you have the clinical room to let extinction happen.

The graphs the talk walks through tell the same story. High W gives you a slow, steady climb. Low W gives you a graph with bigger jumps and more variability, often climbing faster overall, but with a wider band of behavior on the way up.

Plugging this into your spreadsheet or data app

You do not need a custom data app. A four-column spreadsheet does the whole job. Column one is the trial number. Column two is the step the learner completed. Column three is a sorted copy of the last M trials. Column four is the bar: K is calculated as (M + 1)(1 - W) and rounded to a whole rank, and a lookup returns the rank-K value from column three so you can see the current bar at a glance.
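To make the four-column layout concrete, the sketch below builds the same rows in Python. The function name `build_sheet` and the sample steps are hypothetical, K is taken as (M + 1)(1 - W) rounded to the nearest rank, and the bar on each row is the one that applies to the next response.

```python
import math


def percentile_rank(m, w):
    # Assumption: nearest-rank rounding, clamped to the window.
    k = math.floor((m + 1) * (1 - w) + 0.5)
    return min(max(k, 1), m)


def build_sheet(steps, m, w):
    """Rows of (trial, step, sorted last-M window, bar for the next trial)."""
    k = percentile_rank(m, w)
    rows = []
    for trial, step in enumerate(steps, start=1):
        if trial < m:
            # Not enough history yet: leave columns three and four empty.
            rows.append((trial, step, None, None))
            continue
        window = sorted(steps[trial - m:trial])
        rows.append((trial, step, window, window[k - 1]))
    return rows


# Hypothetical step data: the five baseline trials plus two more.
rows = build_sheet([1, 1, 1, 2, 1, 2, 2], m=5, w=0.7)
for row in rows:
    print(row)
```

In a spreadsheet, the rank lookup in column four is typically a "k-th smallest" function applied to the last M cells of column two, so the sorted copy in column three is there for the human reader rather than for the formula.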

If you are on a real data system like BDataPro or the ISCA app, the calculation can run on every response in the background. The clinician sees a simple green or red signal. Met the bar, did not meet the bar. The RBT does not need to do the math. They only need to know whether this response gets reinforced.

The point is not to make the team carry a calculator. The point is to write the rule down so the next call is the same call you would have made, just without the guessing.

Frequently asked questions

What W value should I start with for a new skill acquisition target?

Start at 0.7 unless you have a reason to deviate. 0.7 keeps the learner reinforced on the majority of trials, which protects the response class from collapsing into extinction in the first few sessions. If the learner shows clean acquisition for a week and the graph is climbing, drop W to 0.5 to speed things up. If the learner stalls or starts engaging in the precursor behaviors you are trying to avoid, raise W to 0.8.

Is the percentile schedule the same as a percentile reinforcement schedule?

Yes. Percentile schedule, percentile reinforcement schedule, and percentile schedule of reinforcement are the same thing. The K = (M + 1)(1 - W) equation comes out of Galbicka's 1994 work on percentile schedules of reinforcement for shaping behavior. Different authors use slightly different notation, but the structure of the rule is the same. You pick a look-back window, you pick a reinforcement density, and the schedule returns the criterion the next response has to beat.

How is the percentile schedule different from a changing-criterion design?

A changing-criterion design is an experimental design where you, the clinician, decide when to move the bar up. You hold a criterion for some number of sessions, confirm the learner is meeting it, and then bump the criterion. The percentile schedule does that bump for you, every trial, based on the last M responses. Changing criterion is a planning tool. The percentile schedule is a session-time tool. They can stack. You can use a changing-criterion design to decide which target the learner is working on this month, and a percentile schedule to set the bar inside that target on every trial.

Want the rest of the math?

This page is one half of one CEU. The other half walks the same data through contingency strength, which is the equation that tells you whether your behavior reduction intervention is going to work before you run it. Same talk, same instructor, same hour.
