Rewarding schemes

11. Rewarding schedules

There is no such thing as a process that evolves automatically, except for the sun rising in the east every morning and setting in the west every evening. All other processes need directing. Behavioural change and the learning of skills are directed by (positive or negative) reward. If you dish out rewards constantly, however, a level of oversatisfaction can be reached at which reward is no longer seen as a stimulus.
Besides, applying rewards can consume a lot of training time. Most trainers therefore use a schedule of rewards. There are five types of schedule: fixed interval, variable interval, fixed ratio, variable ratio and random. We will discuss each of them with its pros and cons.

Fixed interval
A fixed interval means that a reward follows after a set, fixed time, for example every five minutes. Salary payments are a good example of fixed interval rewards.

Variable interval
Variable interval means that the amount of time between rewards varies. There could be two minutes between two rewards one time and seven minutes another time. If receiving e-mail counts as a reward, then an e-mail account can be considered a reward system with a variable interval.
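
The two interval schedules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example, time is counted in whole minutes, and the variable gaps are drawn uniformly between two and seven minutes (the figures used above), which is just one possible way to vary them.

```python
import random

def fixed_interval(total_minutes, interval):
    """Reward times when a reward follows every `interval` minutes."""
    return list(range(interval, total_minutes + 1, interval))

def variable_interval(total_minutes, min_gap, max_gap, rng):
    """Reward times when each gap is drawn at random (assumption: uniformly)
    between min_gap and max_gap minutes."""
    times = []
    t = 0
    while True:
        t += rng.randint(min_gap, max_gap)  # randint includes both ends
        if t > total_minutes:
            break
        times.append(t)
    return times

rng = random.Random(0)  # seeded so a run can be repeated
print(fixed_interval(30, 5))           # every five minutes
print(variable_interval(30, 2, 7, rng))  # gaps of two to seven minutes
```

With the fixed interval the next reward can be predicted to the minute; with the variable interval only the range of the waiting time is known.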

Fixed ratio
A fixed ratio means that there is a fixed ratio of rewarded to non-rewarded incidents; for instance, every fourth incident is rewarded. With subs whose motivation is low to begin with, and for whom reward is the main motivating factor, this will lead to below-average results.
The first three times the behaviour tends to be suboptimal; only the fourth time, the incident that leads to reward, will the sub try harder. The fourth incident would be rewarded anyway, even if the behaviour is not optimal. For most subs this will not function well unless the ratio is 1:1, so that each incident is rewarded.

Variable ratio
This last problem can be solved by working with a variable ratio. Suppose the ratio is 1:4; this means that on average a reward will follow once every four times, but not necessarily on the fourth time. In a series of 12, rewards could fall on the first, the seventh and the eighth time.
We refer to it as VSR (Variable Schedule of Reinforcement). In BDSM training this is considered to be the most effective form: the sub will do her best every time, for this could be the one that is rewarded. The longer the reward is withheld, the harder the sub will try, for she expects that the next one is more likely to be the one that is rewarded.
If you promise your sub that, on good behaviour, orgasms will follow on average once every three days, she will behave well every day, for today could be orgasm day.
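
The difference between a fixed ratio and a variable ratio of 1:4 over a series of 12 can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names are hypothetical, and the variable schedule is modelled here (as an assumption) by spreading exactly three rewards at random over the twelve incidents.

```python
import random

def fixed_ratio(n_trials, every):
    """Rewarded incidents when every `every`-th incident is rewarded."""
    return [t for t in range(1, n_trials + 1) if t % every == 0]

def variable_ratio(n_trials, ratio, rng):
    """Rewarded incidents when the same number of rewards is spread
    at random over the series (assumption: fixed count per series)."""
    n_rewards = n_trials // ratio
    return sorted(rng.sample(range(1, n_trials + 1), n_rewards))

rng = random.Random()  # unseeded: every run gives a different series
print(fixed_ratio(12, 4))          # always the 4th, 8th and 12th incident
print(variable_ratio(12, 4, rng))  # three incidents, positions unpredictable
```

Because this sketch fixes the number of rewards per series, the chance that the next incident is rewarded does rise as unrewarded incidents pile up within that series. If each incident were instead rewarded independently with a chance of one in four, the average would be the same but the chance would stay constant from one incident to the next.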

Random
Lastly we have the random schedule. In this case there is no connection between a certain behaviour and the consequences of that behaviour. In training a sub it is very important that the sub understands why reward follows at random. If she doesn’t understand that, there is a chance that she will change her behaviour in order to get a reward after all. The pressure put upon the sub is immense, and it is a method that usually only succeeds with a high level of submission.

The disadvantage of random reward
Rewards initiate wanted behaviour. Random rewards carry the risk that the sub tries out all kinds of behaviour that she knows or suspects the trainer will see as wanted (throwing behaviour at the trainer). The sole reason for doing this is to increase the chances of a reward.
Even if a reward doesn’t follow, the sub will do anything just to please and be liked.

Most subs will do anything to get their Dom’s attention: dress nicely, wear his favourite cologne, play nice, sweet-talk, act out. Most Doms are pleased with these attentions. Once the attention is no longer given, there is a risk of frustration and obstruction.

Extinction

We discussed the importance of automation. In most cases the rule applies that behaviour that was learned easily (with minimal training effort) will be easily unlearned or forgotten. Behaviour that took many hours of training will usually stick longer, for the sub remembers the effort she had to make.
Trainers tend to stop rewarding behaviour once it is automated; why should they, since it is after all an automated process? At some point, however, the sub will wonder why she is still doing this if she gets nothing in return. The lack of reward over a long period of time can lead to extinction of the behaviour.

Counterproductive
The sub sees that her efforts are no longer rewarded and stops the behaviour, or might even show other (opposite) behaviour in order to be retrained and rewarded. Most subs regard training as an important form of attention. In training, the trainer (Dom) gives the sub all his attention and even thinks of actions specifically adapted to her. Training is often seen as a reward in itself: the attention. A sub might therefore consider showing opposite behaviour in order to be retrained and get all that attention all over again.

How to protect against extinction
The variable ratio reward schedule is the best guarantee against extinction. If you know that you are not rewarded every single time, you tend to repeat the behaviour, for the chances of reward seem to increase with every next event. Most slot machines use this principle and create addiction with it. People know the machine will, on average, pay out (reward) once in so many events. The player reasons: the more times I don’t win, the greater the possibility that I will win on the next event.

Explosions of behaviour as a strategy of survival
Behaviour that was learned through extensive training over a longer period will go extinct less quickly, but it was probably often rewarded (during training) in the past. When there is never a reward now, there is a chance of an explosion of behaviour. As a survival strategy, and to avoid extinction, the sub will show this behaviour in the form of an explosion, even if nobody asked her for it. The sub seems unable to control the urge, and frustration will increase if the behaviour isn’t rewarded.
Although this happens more often in animal training and less in BDSM training (for one can explain why there are no longer rewards), it does happen in BDSM training, foremost in the ‘animalistic’ non-verbal items (like assuming a position when the trainer snaps his fingers).

Example of a behaviour explosion
Ella has noticed that John – while having is cock sucked – likes it when she takes his cock deep down her throat coughing up lots of slime and saliva. If she did John would state he liked it and moaned and grunted.
When time passes John is getting used to this treatment and will moan less often. Chances are that Ella will take his cock very often and very hard deep down her throat, causing more pain than pleasure. Instead of moaning John even pushes her head back a bit which adds to the frustration and increasing her desire to take it even deeper.
At last she will – out of frustration – give up on deep throating altogether. The survival strategy of behaviour explosion has failed end it will extinct anyway.
This example shows the importance of consistency with the schedule of reward.

The unnecessary use of VSR
To round this off I want to warn against the unnecessary use of VSR. Some behaviour will, almost automatically, lead to a (small) reward. You only really need VSR when you are not in contact with your sub or don’t have the opportunity to give a reward.
Suppose a sub is learned to present herself to her Dom at the start of a session. For example naked, arms folded in the small of her back, legs spread, cunt pushed out. In training this was rewarded through an extended massage of her clit.
After automation the sub finds out that she gets a (small) reward every time she presents herself like this.
Her Dom smiles, looks happy and tells her she’s a ‘good girl’; that will do as a reward. There is no need to follow a schedule in rewarding the original way.

In 24/7 or TPE relationships VSR is (as a rule) not needed; in other types of relationship it is needed more often.

StarmasterX | info@starmasterx.com