The time has come, and I have been politely asked to remove my math questions from the Big Programming Thread.
So why not have a math thread? Hopefully it isn't just me by myself in here; I would think there should be as much interest in maths as in programming, if not more.
So what is welcome here? Anything math related! Random questions, theory, homework discussion, anything. But we must have proper etiquette (as arbitrarily determined by me) with the following Dos and Don'ts.
Do: Be respectful - don't be condescending or make fun of people. Not everyone is at the same level and we should all be here to learn or teach. There will be zero tolerance for excessive rudeness or trolling.
Use google - If you are looking for help understanding or working out a problem, do a simple search to see if there are sufficient resources out there for you already. If it's a problem with many resources but you still don't understand, explain why.
Accurately explain your problem - if people are expected to put time into helping you with a problem, then you should put time into explaining it accurately so they can understand it.
Make your work readable. If you upload a scan/pdf, we need to be able to read the whole thing. If the equation is too complicated (symbols, sums, etc) to simply type out and you don't have a scanner, you can use latex online. There is a free editor: here.
Do not: Use this thread to cheat on homework. It's for learning. You don't learn by cheating. I don't think it's reasonably enforceable to try to stop people from cheating, but if people think you are only here for solutions then you will probably be shamed for it.
Take contributors for granted. Don't spam your questions, or get upset if you don't get answered. If you received help, maybe try to help others if you can. Or at least continue to be respectful.
Go off topic. If it's loosely related to math you can probably get away with it. If it's completely off topic you may lose privileges to post in here.
I've been eager to learn about the Fourier transform. In particular I wanted to use it to do pitch adjustments of sound samples. I have some code here for playing sounds, and want to add some pitch adjustment stuff.
Would anyone mind chatting with me about the mathematics? I was hoping to find someone knowledgeable about Fourier transforms and their applications that I could bounce a bunch of questions off of. Please PM me if anyone would be so kind!
Edit: Ended up figuring it all out! Posted followup on 6th page of this thread here.
i took a 3rd year math course in "applied combinatorics". 4 month course. had to take it to complete my degree. the grade was based on 2 mid-terms at 25% each and the final exam at 50%.
i got the 3rd highest mark in a class of ~35 on the 1st mid-term with 50/100. 1 guy got 66/100 and another guy got a perfect 100/100. obviously, in the class where we got our grades back, we were all a little devastated.... lots of moaning, whining and complaining... prof speaks up.... and says...
"if i flip a coin 35 times and it only comes up heads 2 times i do not statistically adjust that result. i note it as a special and unusual case". what he was saying was "there could well be 33 morons in this room ... so shut up you idiots"
a bunch of people dropped the course without academic penalty immediately after receiving their grade in the 1st mid-term. There were 20 people left in the class. it was fucking brutal. toughest math course i ever took. i spent 10 amazing fun filled weekends with my applied combinatorics textbook that semester. i still remember half the content of the course and i took it more than 8 years ago.
@JimmyJRaynor: if your university needs more than 3/35 people to pass the year (and it's very likely), your course was very badly designed imho :o.
I have a question but I'm not sure if the answer is trivial or if we don't have the answer yet; it's related to probability, although with some CS in it, so I'll ask here!
Say we build a Bayesian model that estimates the odds of a real-life event A happening at 95%. But the event happens in real life only once (the results of an election, for example). Say A happens as "predicted". So what? Was our estimation accurate? Maybe it actually had an 80% chance of happening, but it still happened because it's still a likely event. And it didn't necessarily happen by random chance, especially if the event we are trying to predict is an election and not some randomly generated thing. So how can we judge whether our model was well suited? I guess if we just want our model to predict what will happen while minimizing loss and so on, like we often do, then there is no problem. But I feel like there's an inherent philosophical/epistemological problem with Bayesian models :/. edit: would being able to reproduce the event enough times for the results to be statistically significant allow us to correctly evaluate whether our estimation was right? But it still wouldn't be an absolutely precise estimation, and is it even possible to have such a thing?
(think about FiveThirtyEight and the likes for context)
edit 2: another "application" of this question would be smartphone weather predictions! They probably use some kind of Bayesian model for that, and they will tell you: "there is a 30% chance that it'll rain at this hour." How are we supposed to use this information? Assuming their estimation is roughly correct, a wise choice would be to take an umbrella if doing so has a positive mathematical expectation, since we have the probability of the event... but how can I quantitatively assess how much of a pain not having an umbrella would be? I can say: "I would feel neutral if I have an umbrella and it is raining, so I assign 0 value to having an umbrella". But to have a rough idea of how painful it'll be to not have an umbrella if it happens to rain... I would need to know how much and for how long it would rain! And if they can't tell us that, I can't really put their intel about the weather to good use. It won't ever be the wisest choice :/.
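For what it's worth, the umbrella decision described here is just an expected-cost comparison. A minimal sketch, where the two cost numbers are exactly the subjective quantities the post says the forecast doesn't give you (made up here purely for illustration):

```python
def should_take_umbrella(p_rain, cost_carry, cost_soaked):
    """Take the umbrella if its fixed cost beats the expected cost of rain.

    cost_carry: nuisance of hauling an umbrella around all day.
    cost_soaked: pain of being caught in the rain without one.
    Both are subjective, made-up utilities -- the exact quantities
    the forecast does not provide.
    """
    return cost_carry < p_rain * cost_soaked

# With a 30% chance of rain, carrying wins only if getting soaked is
# more than ~3.3x as bad as carrying:
print(should_take_umbrella(0.3, cost_carry=1, cost_soaked=10))  # True
print(should_take_umbrella(0.3, cost_carry=1, cost_soaked=2))   # False
```

The model doesn't resolve the post's real complaint: without knowing how hard and how long it would rain, `cost_soaked` is a guess.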
Any program such as Mathematica or Matlab will give you the answer ezpz. Why bother calculating it by hand?
It would take a very long time to type it all in correctly, so it's probably safer to quickly check whether it's solvable at all (by looking for forbidden operations or whatever); if it is, a lot of software could do it.
With a basic (high school) understanding of logarithms and trigonometry, this should be a cinch conceptually. It's just a lot of procedures and arithmetic.
Seems trivial, actually. Just a load of simplifications and then arithmetic. But things like the sqrt of a cubed number are simply the absolute value of that number. I looked it over and don't see an obvious multiplication by 0, so don't feel like doing the work. But it's not difficult as such.
A lot of procedures? 1/2 hours?
Probably about 3-5 minutes if you have a calculator and know what you're doing.
You can just use a calculator. Don't be scared by complex looking things. Take a step back and look at what is actually written. In the end it's just a sum and multiplication of a few terms.
Now all you have to do is calculate A through E first and then use the sum for the final answer.
I updated the OP. If you upload scans, please make sure they are readable.
If your problems are too hard to simply type and you do not have a scanner, please use latex. An easy solution is an online latex editor like this: https://www.codecogs.com/latex/eqneditor.php
On June 09 2017 23:21 Nin54545 wrote: Thank you for the A Manit0u, i am not so confident in my ability, i will study equations in a few years...
Then it is probably too early
In this specific case, sin(30), cos(60) and log100(10) are just clever ways of writing 1/2, and most sqrt() also disappear. Pretty much everything simplifies nicely; it basically looks like an equation designed to write something simple in a complicated way.
The weird ones are:
- 14.661 and 21584, which seem arbitrary. Probably aimed at obtaining a given number as a result, but not an elegant way to do it.
- D just looks wrong. Probably 512 and not 5*sqrt(2), but even then it should probably read (512*0.5)² instead of 512*(0.5)² (and then D=8).
- sqrt(1/2*sqrt(16)) in C looks strange; it leaves a sqrt(2) in the equation, which is awkward when everything else is just fractions.
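The simplifications mentioned (sin(30°), cos(60°) and log base 100 of 10 all being 1/2, and square roots of squares collapsing to absolute values) are easy to sanity-check numerically:

```python
import math

# sin(30 deg), cos(60 deg) and log base 100 of 10 are all 1/2
assert abs(math.sin(math.radians(30)) - 0.5) < 1e-12
assert abs(math.cos(math.radians(60)) - 0.5) < 1e-12
assert abs(math.log(10, 100) - 0.5) < 1e-12

# sqrt of a square is the absolute value, not the original number
assert math.sqrt((-3) ** 2) == 3
```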
On June 09 2017 04:13 CecilSunkure wrote: I've been eager to learn about the Fourier transform. In particular I wanted to use it to do pitch adjustments of sound samples. I have some code here for playing sounds, and want to add some pitch adjustment stuff.
Would anyone mind chatting with me about the mathematics? I was hoping to find someone knowledgeable about Fourier transforms and their applications that I could bounce a bunch of questions off of. Please PM me if anyone would be so kind!
Doing some biophysical modeling and analyzing (possible) oscillations in noise, I did write some FT code. It's a bit different from pitch adjustment: I wanted a power spectrum of time-series data. I used the FFTW C library, which seems to be the fastest thing you can get for FTs, unless you use something optimized for a specific architecture/chipset/hardware. It is reasonably straightforward to use, and you can call it from pretty much any language: C/Python/matlab/Julia.
It is very much a black box; the library is so fast because it divides the problem into chunks and uses a mix of several numerical methods, depending on the nature of the problem and the hardware you are running on. It is completely opaque. But since it is an industry standard, that's ok.
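As a rough sketch of the power-spectrum computation itself (using numpy's built-in FFT here for brevity rather than FFTW; the FFTW calls look different but the idea is the same):

```python
import numpy as np

def power_spectrum(signal, sample_rate):
    """Return (frequencies, power) for a real-valued time series."""
    spectrum = np.fft.rfft(signal)                # real-input FFT, half spectrum
    power = np.abs(spectrum) ** 2 / len(signal)   # squared magnitude per bin
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs, power

# A 5 Hz sine sampled at 100 Hz for 2 seconds shows a peak at 5 Hz:
t = np.arange(0, 2, 1 / 100)
freqs, power = power_spectrum(np.sin(2 * np.pi * 5 * t), 100)
print(freqs[np.argmax(power)])  # 5.0
```

All the hard signal-processing choices mentioned below (windowing, leakage, resolution) live outside this function; this is only the easy core.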
To me the signal processing element of it all was a bit of a dark art. You need to be an electrical engineer specialized in signal processing to really know how to choose the parameters that most effectively convert time-series data into frequency-series data. Windowing, sampling, spectral leakage, aliasing, frequency resolution, some version of the Heisenberg uncertainty principle saying that an increase in frequency resolution necessarily decreases time resolution, and all kinds of artifacts that might pop up; none of that was easy to understand 'on the fly'. I still remember that 'convolution in the time domain corresponds to multiplication in the frequency domain', but if I had to explain it right now, I'd fail. In the end, I am a chemist by training, working with mathematicians turned biologists. Signal processing using FTs is a big thing in engineering, and scientists mostly just use it as a black box.
The discreteness also doesn't help. The continuous math is 'simple' to understand, as long as you are comfortable with the complex plane, but the implications of discreteness made it all a bit more confusing, especially since I never took a course in discrete maths. And I was on a deadline to just get it working, so I didn't have the time to patiently go through a signal processing textbook and try out simple things step by step.
That said, for what you are doing: if you transform some sound in the time domain to the frequency domain, you can hit it with some function. The frequencies in your signal will then change. When you convert it back to the time domain, it will be a different sound, as it contains different harmonics/overtones. I guess this is how autotune works, in a way.
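That round trip can be sketched directly. This is a deliberately crude version: real pitch shifters work on short overlapping windows with phase handling, whereas this just remaps the frequency bins of the whole sample, so expect artifacts:

```python
import numpy as np

def crude_pitch_shift(samples, factor):
    """Shift pitch by `factor` (2.0 = one octave up) via FFT bin remapping."""
    spectrum = np.fft.rfft(samples)
    shifted = np.zeros_like(spectrum)
    for i, value in enumerate(spectrum):
        j = int(i * factor)          # move each frequency bin up/down
        if j < len(shifted):
            shifted[j] += value
    return np.fft.irfft(shifted, len(samples))

# A 110 Hz tone shifted by factor 2 comes out dominated by 220 Hz.
```

Note this also changes the duration-independent character of the sound; doing it properly (phase vocoder style) is where the windowing headaches above come in.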
For the math, I thought this video was best:
In the end it is all about projecting the time data onto the complex plane. That's why it uses the sine and cosine.
As for applications, it is used all over the place. It is probably one of the most commonly used algorithms around. Everyone with an electronic device, phone, mp3 player, etc., uses it all the time: sound, spectra, analysis/recording/sampling of data, but also data compression. As scientists, we usually use it when we record a spectrum of a molecule. Instead of getting how many photons it absorbs at each point in time, we get a fingerprint of which frequencies it absorbs in general. It removes noise, compacts what is happening over a longer period of time, and shows all the info we want to know in a straightforward manner.
On June 09 2017 04:13 CecilSunkure wrote: I've been eager to learn about the Fourier transform. In particular I wanted to use it to do pitch adjustments of sound samples. I have some code here for playing sounds, and want to add some pitch adjustment stuff.
Would anyone mind chatting with me about the mathematics? I was hoping to find someone knowledgeable about Fourier transforms and their applications that I could bounce a bunch of questions off of. Please PM me if anyone would be so kind!
He has some nice playlists explaining stuff about the Fourier transform, the discrete Fourier transform, the Z domain, sampling, zero padding, etc. etc... all the things you will need if you are working with sound.
Wish I could help you more, but I always understood Fourier transforms in a very superficial way. If you really want to understand, you will need to go deep into the math, but there are some very good ways to visualize these transforms, which help with the extremely abstract math.
On June 09 2017 09:22 Poopi wrote: @JimmyJRaynor: if your university needs more than 3/35 people to pass the year (and it's very likely), your course was very badly designed imho :o.
I have a question but I'm not sure if the answer is trivial or if we don't have the answer yet; it's related to probability, although with some CS in it, so I'll ask here!
Say we build a Bayesian model that estimates the odds of a real-life event A happening at 95%. But the event happens in real life only once (the results of an election, for example). Say A happens as "predicted". So what? Was our estimation accurate? Maybe it actually had an 80% chance of happening, but it still happened because it's still a likely event. And it didn't necessarily happen by random chance, especially if the event we are trying to predict is an election and not some randomly generated thing. So how can we judge whether our model was well suited? I guess if we just want our model to predict what will happen while minimizing loss and so on, like we often do, then there is no problem. But I feel like there's an inherent philosophical/epistemological problem with Bayesian models :/. edit: would being able to reproduce the event enough times for the results to be statistically significant allow us to correctly evaluate whether our estimation was right? But it still wouldn't be an absolutely precise estimation, and is it even possible to have such a thing?
(think about FiveThirtyEight and the likes for context)
edit 2: another "application" of this question would be smartphone weather predictions! They probably use some kind of Bayesian model for that, and they will tell you: "there is a 30% chance that it'll rain at this hour." How are we supposed to use this information? Assuming their estimation is roughly correct, a wise choice would be to take an umbrella if doing so has a positive mathematical expectation, since we have the probability of the event... but how can I quantitatively assess how much of a pain not having an umbrella would be? I can say: "I would feel neutral if I have an umbrella and it is raining, so I assign 0 value to having an umbrella". But to have a rough idea of how painful it'll be to not have an umbrella if it happens to rain... I would need to know how much and for how long it would rain! And if they can't tell us that, I can't really put their intel about the weather to good use. It won't ever be the wisest choice :/.
For your first question, it doesn't matter if the model is Bayesian or not. Bayesian statistics uses Bayes' theorem to come up with a posterior for the model parameters, but the point estimate from the posterior that you're using as your prediction has the same interpretation as in a non-Bayesian model. If you want to quantify model "accuracy" (using the term loosely here), a Bayesian model is evaluated with the same metrics as non-Bayesian models (with the exception of metrics that require a posterior).
Of course, it's difficult to evaluate the quality of a probability prediction model with a single test point. However, with good modeling, good priors, and a number of test points large enough to make evaluation sensible (but not so large that the value of the prior information becomes negligible), Bayesian models will usually outperform most non-Bayesian models. (Disclaimer: I'm making a lot of assumptions here, but trying to speak generally enough to be useful and carefully enough to stay accurate.)
I don't see why you think this is a philosophical problem with Bayesian inference. Bayesian inference isn't really advertised as something that allows you to evaluate models with fewer test points. It's usually advertised as something that lets you incorporate prior information to build better models when there's little data available, has nicer interpretations of uncertainty measures, relies somewhat less on parametric assumptions than frequentist statistics, and gives you a full posterior as opposed to point estimates.
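One concrete example of such a metric, applicable to any probability forecaster (Bayesian or not), is the Brier score: the mean squared gap between predicted probabilities and the 0/1 outcomes. Note it only becomes informative over many predicted events, which is exactly the single-election difficulty raised above. A minimal sketch:

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect, fully confident forecaster.
    """
    return sum((p - o) ** 2
               for p, o in zip(predicted_probs, outcomes)) / len(outcomes)

# A forecaster who said 95% for ten events, nine of which happened,
# scores slightly worse than one who said 90% for the same events:
print(brier_score([0.95] * 10, [1] * 9 + [0]))  # ~0.0925
print(brier_score([0.90] * 10, [1] * 9 + [0]))  # ~0.0900
```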
But my question is: can you know the real probability of the event?
With a finite amount of data you will never be able to learn the "real" probability of an event. What you obtain using statistical methods is always an estimate. If you read scientific articles that use statistical analysis, they will always report both point estimates and standard deviations, confidence intervals, or posterior belief distributions to measure how "precise" their reported point estimate is.
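As a toy illustration of a point estimate plus an uncertainty measure, here is the standard Beta-Binomial setup for a repeatable coin-like event (a generic textbook example, not anything specific to the discussion above; a=b=1 is the uniform prior):

```python
import math

def beta_posterior(successes, failures, a=1.0, b=1.0):
    """Posterior mean and std dev for an event probability under a Beta(a, b) prior."""
    alpha, beta = a + successes, b + failures
    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return mean, math.sqrt(var)

# 7 successes out of 10 with a uniform prior:
mean, sd = beta_posterior(7, 3)        # mean ~0.667, sd ~0.131
# After 700 out of 1000 the point estimate is similar,
# but the uncertainty around it is far smaller:
mean2, sd2 = beta_posterior(700, 300)  # mean ~0.700, sd ~0.014
```

The point estimate alone looks the same in both cases; the spread is what tells you how seriously to take it.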
Supposing we knew everything about how everything works, then yes, we would be able to know the real probability (which would be either 0 or 1).
But at the end of the day we know very little about how everything works and we may not even truly realize how little we know.
Trying to predict something like an election or the weather comes down to building simplified models over the factors we think influence the outcome; then the models can be reevaluated, how the factors are weighted can be adjusted, etc.
Yeah, probability is a way to estimate the likelihood of something happening without complete information. If we had precise knowledge of how you flip the coin - initial starting position, angle of the flick, force of the flick, exact dimensions of the coin and its weight, wind conditions, and who knows how many other parameters - the probability of heads would not be 1/2. You'd be able to build a physical model to know the exact answer. It'd be 0 or 1. At that point it's purely deterministic.
On June 09 2017 09:22 Poopi wrote: @JimmyJRaynor: if your university needs more than 3/35 people to pass the year (and it's very likely), your course was very badly designed imho :o.
I have a question but I'm not sure if the answer is trivial or if we don't have the answer yet; it's related to probability, although with some CS in it, so I'll ask here!
Say we build a Bayesian model that estimates the odds of a real-life event A happening at 95%. But the event happens in real life only once (the result of an election, for example). Say A happens as "predicted". So what? Was our estimate accurate? Maybe A actually had an 80% chance of happening, but it still happened because it was still a likely event. And it didn't necessarily happen by random chance, especially if the event we are trying to predict is an election and not some randomly generated thing. So how can we judge whether our model was well suited? I guess if we just want our model to predict what will happen while minimizing loss and so on, like we often do, then there is no problem. But I feel like there's an inherent philosophical/epistemological problem with Bayesian models :/. edit: would being able to reproduce the event enough times for the results to be statistically significant allow us to correctly evaluate whether our estimate was right? But it still wouldn't be an absolutely precise estimate, and is it even possible to have such a thing?
(think about FiveThirtyEight and the likes for context)
In science (i.e., not mathematics), nothing is absolute. Even for the most mundane and predictable of events, we cannot know anything with 100% certainty. For example, when I throw a die, we cannot know that there is zero chance of the die shattering on impact. Yes, we can calculate the forces involved. But who is to say that, for the first time ever, the laws of nature won't suddenly change?
In the same way, if a die truly gives 50/50 for odd vs. even, you only get to exactly 50/50 going to infinity. In fact, it is completely impossible to get 50/50 if you throw a die an odd number of times. The reason we know a die is 50/50 to be odd or even is because we know exactly how many sides it has. The assumption then is that the die is perfectly fair. Which probably isn't the case. If you are really bored, you can take a bunch of dice, or coins, and throw/flip each of them an absurd number of times. Then calculate how likely the outcome you got is under the assumption that the die/coin is fair.
In principle, any imperfection or flaw means the die/coin isn't perfectly symmetrical, and thus it can in principle lead to a biased die. So in the case of an unfair die/coin, there is no way to know with 100% accuracy what the exact probabilities are. The law of large numbers can get you far enough - far enough for any real-world application. (Now, if you really need to throw a lot of times, the die/coin may actually wear, and its fairness may change as a function of the number of throws, adding another layer of complexity.) So it is mainly a philosophical debate.
That said, you could in principle control the outcome of the die/coin 100% by 'deciding' how you throw it. But I don't know of any magicians who have enough skill to throw a die in a certain way that gets them the outcome they want.
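The bored-experimenter test described above can be sketched directly. This is a sketch of a two-sided check against the fair-coin hypothesis; the function names are mine:

```python
import math

# How surprising is an observed count of heads if the coin is fair?
# P(exactly k heads in n fair flips) = C(n, k) / 2^n.

def binom_pmf_fair(n, k):
    return math.comb(n, k) / 2 ** n

def tail_prob_fair(n, k):
    """P(a count at least as far from n/2 as k), assuming a fair coin."""
    dev = abs(k - n / 2)
    return sum(binom_pmf_fair(n, i) for i in range(n + 1)
               if abs(i - n / 2) >= dev)

# 60 heads out of 100 flips: mildly unusual for a fair coin (~5.7%).
p60 = tail_prob_fair(100, 60)
# 90 heads out of 100 flips: astronomically unlikely under fairness,
# so in practice we would abandon the "fair coin" assumption.
p90 = tail_prob_fair(100, 90)
```

Rejecting fairness is easy; pinning down the *exact* bias still takes ever more flips, which is the point being made above.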
On June 10 2017 07:43 fishjie wrote: Yeah probability is a way to estimate the likelihood of something happening without complete information. If we had precise knowledge of how you flip the coin - initial starting position, angle of the flick, force of the flick, exact dimensions of the coin and its weight, wind conditions, and who knows how many other parameters, the probability of heads is not 1/2. You'd be able to build a physical model to know the exact answer. It'd be 0 or 1. At that point its purely deterministic.
And even then it would be an exact answer under the assumption that the physical model is completely accurate.
On June 10 2017 07:45 Ernaine wrote: In science (ie: not mathematics), nothing is absolute. Even the most mundane and predictable of events; we cannot know anything for 100%.
For example, when I throw a dice, we cannot know that there is a zero chance of the dice not shattering on impact. Yes, we can calculate the forces involved. But who is to say that for the first time ever, the laws of nature won't suddenly change?
Even if a magician appeared before us and rolled a 1 a billion times in a row, that wouldn't confirm anything other than rolling a 1 being a possibility. Now, assuming a uniform distribution (equal likelihood of each outcome - 1/6 of rolling a 1), what the magician did occurs extremely rarely - with probability one sixth to the billionth power - so for real-world, everyday purposes we would argue the magician seems able to manipulate the die such that the distribution is not uniform.
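For scale, the "fair die" hypothesis assigns that run a probability too small even to store as a float; a quick sketch in log space:

```python
import math

# Probability that a fair die shows the same chosen face a billion times:
# (1/6) ** 1_000_000_000. That underflows any float, so compute its
# base-10 logarithm instead.
rolls = 1_000_000_000
log10_prob = rolls * math.log10(1 / 6)
# log10_prob is about -7.8e8: a probability with roughly 778 million
# zeros after the decimal point. "Uniform die" is not a serious
# hypothesis for such data.
```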
You don't only get exactly 50/50 going to infinity; you have a 50% chance of getting 50/50 if you throw it 2 times. And I know the definition of probability with infinity and such, that's not really my question. About the dice example, a die has a lower chance of giving a 6 than a 1 afaik, because there are fewer holes on the 1 face, but again that's not my question :/.
And I'm not talking about dice at all, because dice are pretty well random.
What I'm talking about is the probability of real-life events that are a priori not random: are we still stuck on determinism issues?
Well, I don't agree. Yes, dice may not be fair. No real physical die can be perfectly fair.
But if we throw only 2 times and get each result once, the hypothesis that the true nature of the coin is 10% heads and 90% tails is still somewhat in agreement with the data. And that is quite a bit different from 50/50.
Yes, after 2 trials we can get exactly 50/50, and we can get similar results at 4 trials and all the other even numbers of trials, but that doesn't tell us with high probability that the coin is in fact a completely fair 50/50 coin.
So yes, we need to go to infinity to exactly get 50/50. You can try it with a computer. (Yes, it will have pseudo-random numbers, so it is still a bit iffy, just like having a completely fair die.)
? That's not what I am saying. I'm saying that you can have the same number of evens as odds before infinity, not that you can be sure of the probability before going to infinity - that's the definition of probability, so it's true by definition :o.
On June 09 2017 07:03 JimmyJRaynor wrote: lots of moaning whining and complaining... prof speaks up.... and says...
"if i flip a coin 35 times and it only comes up heads 2 times i do not statistically adjust that result. i note it as a special and unusual case". what he was saying was "there could well be 33 morons in this room ... so shut up you idiots"
It's even more devastating when you take into account that a coin actually has 3 sides. Seems almost like that one lucky bastard who got 100/100 tossed a coin and it came up on the 3rd side.
Either what you are saying is wrong, or it is beside the point. Whatever the true (secret?) probability of the coin is, there is always a possibility that after an even number of flips you have exactly the same number of odds and evens. And the larger the number of throws gets, the smaller those odds are - even in the case of a perfectly fair coin. Only towards infinity do the deviations from 50/50 slowly disappear.
If you refuse to take my word for it, try it out yourself.
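Trying it out confirms the shrinking-tie claim; here is a sketch that computes the exact binomial count rather than simulating:

```python
import math

# Probability of an exact 50/50 split after 2n flips of a fair coin:
# C(2n, n) / 2^(2n), which shrinks like 1 / sqrt(pi * n).

def exact_tie_prob(n_pairs):
    n = n_pairs
    return math.comb(2 * n, n) / 4 ** n

p_2   = exact_tie_prob(1)     # 2 flips:      0.5
p_100 = exact_tie_prob(50)    # 100 flips:    ~0.0796
p_10k = exact_tie_prob(5000)  # 10,000 flips: ~0.008
# An exact tie is common for tiny samples and increasingly rare as the
# number of flips grows, even for a perfectly fair coin.
```

So both posters are right about different things: an exact tie can happen at any even count, but its probability goes to zero, and only the *ratio* converges to 1/2.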
? I'm just saying that if you throw it 2 times you can get an even and an odd - how is that wrong? There is no point, because you didn't answer my question and went on about the definition of probability, which I already know :x. edit: oh nevermind, you are the troll from the other time, that's why you talk nonsense all the time xD.
I talk nonsense because I have a PhD. That's what we do. Troll. Sorry bro. These things are no longer a matter of opinion.
The 'troll from the other time'? At least my account doesn't have a troll name.
[edit]
Ooh, I see now what you mean. You refer to the Go-Deepmind thread, where you casually called people idiots for not agreeing with your trivially wrong position. I called you out, and you didn't like that. How do people like you get 7000 posts on TL? Amazing.
If you throw a coin twice, and it lands 'heads' once and 'tails' once, you know absolutely nothing.
I am one of the few people here able to make quality posts. Why personally attack me for no reason? We have a guy here saying it is fair to believe a coin is 50/50 after throwing it twice. And he calls me a troll. What do you want me to do? Yes, I know not everyone has a sophisticated mind. But that is ok, as long as you can accept that. I am looking at the guy above me.
Honestly, I think there's a lot of conversational equivocation going on right now between the law of large numbers in regards to theoretical probability and then the empirical, actual results obtained along the way. People are making very valid points about probabilistic models but are also talking past each other and getting hung up on semantics.
On June 10 2017 09:03 Ernaine wrote: If you throw a coin twice, and it lands 'heads' once and 'tails' once, you know absolutely nothing.
Well, "absolutely nothing" is a stretch... you know that the probability of getting heads, at the time you made the first throw and under the conditions of that event, was nonzero.
If you assume that the outcome of a coin toss follows a multinomial distribution fixed over time, you can say that the list of outcomes contains the 'heads' and 'tails' events with nonzero probability.
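To make that concrete in the simplest Bayesian terms (a sketch under my assumption of a uniform prior): after one head and one tail, the posterior over the heads probability is Beta(2, 2), which is centered at 1/2 but rules out almost nothing.

```python
# Sketch: with a uniform Beta(1, 1) prior on p = P(heads), observing one
# head and one tail gives a Beta(2, 2) posterior. Its density is
# 6 * p * (1 - p): peaked at 0.5 but spread over the whole interval.

def beta_2_2_pdf(p):
    """Posterior density of P(heads) after 1 head, 1 tail (uniform prior)."""
    return 6 * p * (1 - p)

posterior_mean = 0.5             # symmetric around 1/2
at_mode = beta_2_2_pdf(0.5)      # 1.5, the highest density
at_tenth = beta_2_2_pdf(0.1)     # 0.54: p = 0.1 is far from ruled out
# So two flips tell you both outcomes are possible, and not much more.
```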
In reality the roll of a die or the toss of a coin is perfectly deterministic. Sure, we cannot calculate it (most of the time), given that there are multiple forces at play and their values/vectors are extremely hard to measure accurately. Still, such rolls are not really random in a strong sense.
That's the thing. Statistics is often used for things that are deterministic in nature but have so many parameters tied to them that it's almost impossible to obtain the deterministic outcome. Of course very educated guesses can be made with it, but it's rather sad that we still just can't perfectly model some deterministic events imo - or maybe it's that we have statistics and we're too comfortable settling for "one or the other".
Edit: You also might wanna reevaluate your point about quantum mechanics, "every quantum state can be represented as a sum of two or more other distinct states" and "the result will be another valid quantum state". (Source: Quantum Superposition)
So I've spent a day studying the Fourier transform. So far the math is very understandable, at least in how to apply it. Understanding why it works and specifically what it does is more challenging. Most resources on the subject are just absolutely awful, and I may have to go out and buy an old textbook to get good information.
It seems like FFT implementations popped up around the 80s-90s, and nobody ever really looked at them again in depth. Since then everyone takes an old implementation and wraps it. Well, there's FFTW, but that library is a software engineering nightmare: a horribly obfuscated and heavyweight solution for a particularly small function. But what can you expect from a bunch of academics.
The real challenge is to be able to derive all the math and understand it completely. Only that way can I write a good implementation.
So, any book recommendations from anyone that really understands the math?
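In the meantime, the core recursion is small enough to sketch in plain Python. This is a toy radix-2 Cooley-Tukey for intuition, assuming power-of-two lengths, not a production implementation:

```python
import cmath

# Minimal radix-2 Cooley-Tukey FFT. It splits the input into even- and
# odd-indexed halves, transforms each recursively, then combines them
# with "twiddle factor" rotations.

def fft(x):
    n = len(x)
    if n == 1:
        return list(x)
    assert n % 2 == 0, "this sketch needs a power-of-two length"
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle          # first half of the spectrum
        out[k + n // 2] = even[k] - twiddle  # second half, sign-flipped
    return out

# Cross-check against the O(n^2) DFT straight from the definition.
def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]
```

The savings come from reusing each half-size transform for two output bins, giving O(n log n) instead of O(n^2).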
I'm currently a web dev who has moved into analytics, data science, and usability. My next stop career-wise is eventually aerospace. I'm looking to get back into some general math courses as a refresher. Any thoughts on where I should start? I've been looking into QPLs (quantum programming languages) and find them interesting as well. I'm currently in a position where I might be able to go to a university for free, and I would like to eventually pivot my career. I feel like a general refresher from the beginning would help; I know Khan Academy has a few courses - any thoughts on where to start from there?
The question is whether or not we can induce events we have not yet seen from past events (by coming up with deterministic laws governing physics). So at best, you can achieve near-100% probability.
For example, Newton used induction to come up with his laws of motion based on astronomical data collected by Tycho Brahe. He stated in the Principia that these laws can be generalized to all objects: "In this experimental philosophy, propositions are deduced from the phenomena and are made general by induction. The impenetrability, mobility, and impetus of bodies and the laws of motion and law of gravity have been found by this method. And it is enough that gravity should really exist and should act according to the laws that we have set forth and should suffice for all the motions of the heavenly bodies and of our sea."
In other words, he observed the orbits of the planets, extracted the laws of motion, and said they applied to everything. It's a big jump. So although the physical laws that humans have come up with are deterministic and have been verified empirically, you cannot know with 100% certainty that they apply deterministically to everything in the universe, because we have not observed everything in the universe. For a contrived example, if we are living in a simulation and some coder changes some variables, then the laws of motion could change tomorrow. Or consider dark matter/dark energy: from what I understand, our current laws of physics cannot describe the phenomena we are seeing, so those laws don't apply to a vast majority of the universe.
When this thread was created, I had high hopes. Now, this discussion has been interesting so far, but imho it is not so much mathematics as physics/general science principles, mixed with some philosophical consequences.
Maybe I'm taking a narrow view of mathematics, or maybe I'm not and it is also time for a "Physics Thread".
edit: Also, unfortunately I don't know much about mathematics literature for non-mathematicians.
On June 13 2017 06:23 CecilSunkure wrote: Literature for mathematicians is perfectly fine. This is the math thread after all.
E: Also I vote to start banning goons coming in here to talk philosophy. It's annoying and pointless for a math thread.
A large chunk of computer science is based on discrete math, whose underpinnings are shared with philosophy. Famous philosophers such as Bertrand Russell and Descartes made huge contributions to mathematics. The overlap is real.
Sure, the overlap is real. Real annoying. Just open a new thread.
Also, I'm sure nobody would mind if the philosophy discussion were actually related to some real math discussion. But like 99% of the time that won't be the case.
Anyone have suggestions for (introductory/intermediate) books/open courses related to functional analysis?
I started to follow along with a course on Coursera a few years ago but didn't really stick with it. I've done a bunch of undergraduate-level calculus/algebra courses, but never really got into analysis. Seems like it would be interesting to learn.
RE: The recent discussion, who volunteers to start the epistemology thread? Seems like that's almost what people want, haha
Well, functional analysis is a rather advanced topic. Typically, when you study it, you will have both real analysis and complex analysis under your belt. To get the most out of it you will need measure theory. For this reason most of the books are pretty difficult to read.
Now, if your math background is not that strong (i.e., you are not a math major), the best book I believe is "Introductory Functional Analysis with Applications" by Kreyszig.
I think it is the only such book that doesn't require prior knowledge of any analysis.
I have to say that functional analysis was always at the top of my list of things to learn, but at this point I think I will never have time to study it properly. It is a really fascinating but complex subject.
Please, no! The "ask and answer stupid questions" threads already derail often enough; there is no need for a dedicated physics/philosophy thread for derailment.
It is obvious and intuitive that multiplication and division are inverse mathematical operations. Does anyone have a simple 2- or 3-line intuitive way to summarize why differentiation and integration are inverse operations?
IMHO, a good introductory book on functional analysis is Rudin's Functional Analysis. I should note that while it is a good book to study functional analysis from, it is really useful to have at least a basic understanding of linear algebra, real and complex analysis, and topology beforehand to understand the motivation behind the theory.
On the other hand, I've heard good things about Introductory Functional Analysis with Applications by Erwin Kreyszig for people who look for applications. I haven't read it myself though.
On June 14 2017 01:43 JimmyJRaynor wrote: it is obvious and intuitive that multiplication and division are inverse mathematical operations. does any one have a simple 2 or 3 line intuitive way to summarize why differentiation and integration are inverse operations?
Intuitively, the theorem simply states that the sum of infinitesimal changes in a quantity over time (or over some other variable) adds up to the net change in the quantity.
On June 14 2017 01:43 JimmyJRaynor wrote: it is obvious and intuitive that multiplication and division are inverse mathematical operations. does any one have a simple 2 or 3 line intuitive way to summarize why differentiation and integration are inverse operations?
On June 13 2017 13:13 Mr. Wiggles wrote: Anyone have suggestions for (introductory/intermediate) books/open courses related to functional analysis?
I started to follow along with a course from Coursera a few years ago, but didn't really stick with it. I've done a bunch of undergraduate level calculus/algebra courses, but never really got into analysis. Seems like it would be interesting to learn
RE: The recent discussion, who volunteers to start the epistemology thread? Seems like that's almost what people want, haha
IMHO, a good introduction book to functional analysis is Rudin's Functional Analysis. I should note that it is a good book to study functional analysis; it is really useful to have at least basic understanding of linear algebra, real and complex analysis, and topology beforehand to understand the motivation behind the theory.
On the other hand, I've heard good things about Introductory Functional Analysis with Applications by Erwin Kreyszig for people who look for applications. I haven't read it myself though.
I disagree about Rudin's Functional Analysis. That book targets a graduate math audience. Unless you have a really strong math background that includes measure theory, it's not the book to start with. But at the graduate level it is a great book.
Besides the Kreyszig book that Ingvar and I mentioned in our posts, other introductory books aimed at people with modest backgrounds are "Beginning Functional Analysis" by Saxe and "Elementary Functional Analysis" by MacCluer. I own the latter and found it quite easy to follow if you have some background in analysis.
I should say that I self-study quite a bit of math, and I often find that it is best to own a couple of books, because invariably you will get stuck at some point. Since you are doing things on your own, there is no professor or classmate to ask to help clarify your confusion, so I find it best to have a couple of sources to consult.
On June 14 2017 01:43 JimmyJRaynor wrote: it is obvious and intuitive that multiplication and division are inverse mathematical operations. does any one have a simple 2 or 3 line intuitive way to summarize why differentiation and integration are inverse operations?
It's a little difficult without drawing but here's my best attempt.
I'm guessing you're thinking about integrals as area under a curve. (Alternatively, you could think of it as the inverse of differentiation, which would make the question trivial) So the question is, if I'm looking at the area under a curve, why does it change exactly as fast as the height of the original curve?
Well, area is height * width, at least for rectangles. The slope of the area function (derivative of the integral function) is rise over run taken at smaller and smaller intervals. So it's (height * width) / width, as long as we take a small enough interval that the area under the original function f(x) looks like a rectangle. Substituting f(x) for height, we get that the derivative of the integral at some value x is:
(height of f(x) * width) / width = f(x), that is, the value of the original function. So differentiating the area function gives back the original function.
There's a number of small assumptions there that aren't necessary but omitting them would make the explanation even more complicated.
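A quick numeric sketch of the argument above (my own illustration, not from the thread): build the running area under f(x) = x² out of thin rectangles, then take the slope of that area function at a point; it comes back to f itself.

```python
def f(x):
    return x * x

dx = 1e-4
xs = [i * dx for i in range(20001)]  # grid on [0, 2]

# running area under f, built from thin rectangles
areas = []
area = 0.0
for x in xs:
    areas.append(area)
    area += f(x) * dx

# slope of the area function at x = 1.0 (central difference)
i = 10000
slope = (areas[i + 1] - areas[i - 1]) / (2 * dx)
print(abs(slope - f(xs[i])) < 1e-3)  # True: the slope of the area is f itself
```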
Thanks for the recommendations, Lebesgue and Ingvar, I'll check them out!
I've got a bit of a weird math background, as I got three-ish years into a physics degree before switching into computer science and taking a few additional algebra courses that interested me. The end result is a mish-mash of different subjects, along with a bunch of implicit stuff that I learned in my non-math courses.
Integration is essentially multiplication of a function with an infinitesimal quantity, whereas differentiation is division of the function's change with an infinitesimal quantity.
I am not sure how to do it. I understand how n grows and I understand how i grows and I can relate them positionally in a list but I don't know how to write i as a function of n
I know we have some clever people at TL... here is a chance to show off
On June 15 2017 05:13 travis wrote: I am looking for a relation
the pattern is as follows
n: 1, 3, 7, 15, 31, 63 i: 2, 4, 6, 8, 10, 12
I need a formula for i as a function of n
I am not sure how to do it. I understand how n grows and I understand how i grows and I can relate them positionally in a list but I don't know how to write i as a function of n
I know we have some clever people at TL... here is a chance to show off
Introduce a new variable k and write n(k) and i(k). Now find the inverse, i.e. k(n). Substitute to get i(k(n)), or i(n).
edit: for example you could just say k is the position in the list, so you would have n = 2^k - 1 and i = 2k
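Liebig's substitution can be checked directly; this little sketch (my own) inverts n = 2^k - 1 to get k = log2(n + 1) and plugs it into i = 2k:

```python
import math

def i_of_n(n):
    # n = 2**k - 1 gives k = log2(n + 1), and then i = 2*k
    k = round(math.log2(n + 1))
    return 2 * k

# the pairs from the question: n: 1, 3, 7, 15, 31, 63 / i: 2, 4, 6, 8, 10, 12
for n, i in [(1, 2), (3, 4), (7, 6), (15, 8), (31, 10), (63, 12)]:
    assert i_of_n(n) == i
print("all pairs match")
```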
On June 15 2017 04:06 Shalashaska_123 wrote: Hi, JimmyJRaynor.
Integration is essentially multiplication of a function with an infinitesimal quantity, whereas differentiation is division of the function's change with an infinitesimal quantity.
that's pretty good. i like that one. thx for posting.
Here are some resources. I took applied algorithms in spring as part of my professional master's program, and most of the stuff flew over my head; I only learned what I had to for homework. Now I'm going back to try to understand it. One of the most unexpected things for me (and apparently other people who've taken applied algorithms shared the same surprise) is that the class is almost entirely theory. We did have some coding assignments, but most of it is math. I'll be posting more stuff as I go over my class notes. I just finished my master's and realized I haven't mastered anything.
I analyze my elements, reduce the number of elements by 1, then cut the elements in half. Now I have 2 lists of elements, one of size (n-1)/2 and the other of size n/2.
I do this over and over until I am left with many lists of 2 elements (and lists of 1 element, but for this question we don't consider those).
So the question is, how many lists did we go through in total? n was a list at the top, n/2 and (n-1)/2 were lists, ((n/2)-1)/2 and (n/2)/2 were lists, etc., all the way down to our lists of 2. Lists of 1, left over, do not count.
Not sure I understand the question. Would this be how you do it for n=100: 100 → 2*50 → 4*25 → 8*12 (and 4*1 which we discard) → 16*6 → 32*3 → 32*2 (and 32*1 which we discard)?
So 95 in total?
If so, approximately n/2 + n/4 + ... ~= n
A precise answer is harder, because you need to know how many times you end up with an odd number of elements in your list. There's almost certainly a numerical way of figuring that out, but I'm lazy right now.
On June 16 2017 08:34 DanielReeLee wrote: Hello I'm looking for some problem sets for multivariable and vector calculus. Could I have some recommendations
Check out the multivariable calculus course on MIT OCW.
What is the longest gondola that can take a right-angle turn of a Venetian canal? The width of the canal is 2, respectively 3 length units before, resp. after the turn.
On June 17 2017 04:38 Acrofales wrote: Not sure I understand the question. Would this be how you do it for n=100: 100 2*50 4* 25 8*12 (and 4*1 which we discard) 16*6 32*3 32*2 (and 32*1 which we discard)
So 95 in total?
If so, approximately n/2 + n/4 + ... ~= n
A precise answer is harder, because you need to know how many times you end up with an odd number of elements in your list. There's almost certainly a numerical way of figuring that out, but I'm lazy right now.
well, for n = 100
100
50, 49
25, 24, 24, 24
12, 12, 11, 12, 11, 12, 11, 12
5, 6, 5, 6, 5, 5, 5, 6, 5, 5, 5, 6, 5, 5, 5, 6
2, 2, ... oh god there is a lot, you get the idea. It would stop at all 1s and 2s, but the 1s don't count, and what I want is the count of EVERY list of length > 1, including the original list and the lists in every step.
On June 17 2017 04:15 travis wrote: Now I have 2 lists of elements, 1 of size (n-1)/2 and the other of size n/2.
One of these is not an integer.
hmm, yeah, that's true. What I really want is... uh... the floor of (n-1)/2, I think. See above, lol.
I expect the best way to solve this is to represent it with sums and then simplify them, but I am not good enough.
Oh, ok. That seems incomplete. The 6s would expand to 3, 2, right? And what would then happen? It ends there? Seems like a weird algorithm.
I thought I had a simple solution, but it breaks if any list in your subdivisions has length equal to a power of 2 (that adds 1, which can occur at different points in the tree). So doesn't work, and given the weirdness of your algorithm, I'm not sure there's an easy way of figuring out how often you'll run into a power of 2.
On June 17 2017 04:53 Amanebak wrote: Hey. I stumbled across a problem like this:
What is the longest gondola that can take a right-angle turn of a Venetian canal? The width of the canal is 2, respectively 3 length units before, resp. after the turn.
I apologize for my English.
Unless I missed something, the result is 5*sqrt(2).
The gondola needs to be able to fit into the diagonal of the (2+3)*(2+3) square, which has a length of 5*sqrt(2).
At that point, 2*sqrt(2) of the gondola is in the thinner canal, and 3*sqrt(2) is in the wider canal. If you draw a picture of the situation, it is obvious that no longer gondola could reach as far into the 3-unit canal, and the 5*sqrt(2) gondola can continue onwards from this point.
On June 17 2017 05:39 hypercube wrote: Did you try some numerical experiments?
Are you interested in an exact expression or only asymptotic behaviour? The second one seems to be just on the order of n.
I am starting to think there isn't a reasonable way to get the exact expression. I've looked up analyses of this algorithm and there doesn't seem to be any exact analysis online.
This is actually an analysis of best case quicksort comparisons that I am trying to do, if you are familiar.
I can see conceptually how this would be in the ballpark of n, but can you show the math for that?
Also, I solved the actual values for 1 to 20. It's a bit odd; nowhere near n for low values.
On June 17 2017 05:47 Liebig wrote: let f(n) be the result of your weird stuff for n
I've never seen recursion used this way, gonna take a minute to wrap my head around this lol edit: ok I get it. that's neat, But I am not as quick as you in using it to analyze the outcome
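For what it's worth, here is my guess at the recurrence being discussed (the base cases are an assumption on my part): a list of size n counts itself and recurses on halves of sizes floor(n/2) and floor((n-1)/2), since one element is consumed as the split point, and lists of size 0 or 1 are not counted.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    # count this list if it has length > 1, then recurse on the two halves
    if n <= 1:
        return 0
    return 1 + f(n // 2) + f((n - 1) // 2)

# this happens to give 1, 3, 7 for n = 3, 7, 15
print(f(3), f(7), f(15), f(100))
```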
Well, it's m, such that m+1 is the greatest power of 2 less than n. However, add 1 for every power of 2 you encounter along the way. For instance, n=16 has 8 sublists of size 2 (or greater), rather than 7. How many powers of 2 you'll encounter is hard to predict, but if you want asymptotic behaviour, take a completely absurd upper bound: say all m of your sublists have a size equal to a power of 2; even that would only give you 2m total lists (and is completely impossible), which is still O(n).
On June 17 2017 05:39 hypercube wrote: Did you try some numerical experiments?
Are you interested in an exact expression or only asymptotic behaviour? The second one seems to be just on the order of n.
Asymptotically it is definitely at most order n.
We can easily get an upper bound, because the kth step (with 0 being the initial set) produces at most 2^k new groups of numbers, and there are fewer than log2(n) steps. Thus, there are definitely fewer than (sum over k from 0 to log2(n) of 2^k) = 2^(log2(n) + 1) - 1 = 2n - 1 total sets in this algorithm.
So, this is actually from homework that I have been working on for a long time. The actual question is "What is the fewest exchanges that the algorithm will execute for an input of size n. No justification needed."
(What we have been doing demonstrates the best case for exchanges)
In exact counts, with n elements, if n =3 we do only 1 exchange. If n = 7, we do only 3 exchanges. if n = 15, we do 7 exchanges. After that it is always worse than this.
So does that mean that a technically correct answer to this question is: "(the floor of n/2) - 1" ?
On June 17 2017 04:53 Amanebak wrote: Hey. I stumbled across a problem like this:
What is the longest gondola that can take a right-angle turn of a Venetian canal? The width of the canal is 2, respectively 3 length units before, resp. after the turn.
I apologize for my English.
Unless i missed something, the result is 5*sqrt(2).
The gondola needs to be able fit into the diagonal of the (2+3)*(2+3) square, which has a length of 5 sqrt (2)
At that point, 2sqrt(2) of the gondola is in the thinner canal, and 3sqrt(2) is in the thicker canal. It is obvious that no longer gondola could reach as far into the 3m canal if you draw a picture of the situation, and the 5sqrt(2) gondola can continue onwards from this point on.
Thank you for your reply.
If I understand it correctly, the gondola is turned 45 degrees and touches the sides of the canal at both ends, and touches the 'inner' edge of the turn with its side. That is the diagonal of the (2+3)*(2+3) square. Yes, it can go on, but I don't think it could go back, unless it bends. To see this, imagine the difference in widths being much greater, like 1 before the turn and 100 after.
On June 17 2017 05:36 Acrofales wrote: I thought I had a simple solution, but it breaks if any list in your subdivisions has length equal to a power of 2 (that adds 1, which can occur at different points in the tree). So doesn't work, and given the weirdness of your algorithm, I'm not sure there's an easy way of figuring out how often you'll run into a power of 2.
I think I have an idea. Let's write the sets in rows like travis did, so the k-th row will have 2^(k-1) sets. So if we need K rows, we will have 2^K - 1 sets in total. Unfortunately, this only works if we terminate in a perfect row (e.g. a combination of 3s and 2s, that can't be broken down with our rule).
So here's an outline:
0. Observe that there are at most two distinct (and consecutive) values in any given row.
1. Prove that there is at most 1 non-perfect row, that is, a row which has both elements of n > 1 and n <= 1. I will refer to this as the final row.
2. Give a formula for the sum of the elements in a row. This is doable because a parent of size n produces children of sizes floor(n/2) and floor((n-1)/2), so the sum of the child elements is one less than the parent element.
3. Use this formula to calculate the number of rows, up to and including the first non-perfect row. (call this K)
4. Use the same formula to calculate the number of ineligible elements in the final row. (call this D)
Using the original idea, the final answer is 2^K - D - 1.
edit: 3 seems the trickiest, I'm pretty sure I can do the others.
On June 09 2017 04:13 CecilSunkure wrote: I've been eager to learn about Fourier transform. In particular I wanted to use it to do pitch adjustments of sound samples. I have some code here for playing sounds, and want to add some pitch adjustment stuff.
Would anyone mind chatting with me about the mathematics? I was hoping to find someone knowledgeable about Fourier transforms and their applications that I could bounce a bunch of questions off of. Please PM me if anyone would be so kind!
Well it took a while, but after some studying I ended up getting down an implementation. I referenced "Computational Frameworks for the Fast Fourier Transform" by Charles Van Loan. I also found two good reference implementations here and here. The first link is an implementation of some algorithms from the Van Loan book, which helped show a proper way to convert algorithms to C and verify for correctness. The second link by Paul Bourke is probably the most popular FFT implementation in existence. The Bourke implementation has a sort of nasty side-effect of scaling floating point error across the DFT, so for large inputs the results can zero out early. This can easily be fixed by adding in some direct cos/sin calls, so I did that.
In the end I wrote my own implementation after referencing all 3 sources mentioned above. Luckily the Van Loan book is written entirely in vector and matrix algebra, something I'm already good at.
Anyways, here's my current implementation. I'll probably work on it a little more to create a more useful API. Also, the final scale loop in the if statement ( sign > 0 ) would be a good candidate for SIMD ops. My implementation is in a more modern style than the older ones, so it should be a little faster due to fewer branch misses in favor of a little more raw computation. Also, there's a symmetry that can cut the computational cost in half that I've yet to implement. But this is fine for now:
#include <math.h>
#include <stdint.h>

uint32_t PopCount( uint32_t x )
{
    uint32_t a = x - ((x >> 1) & 0x55555555);
    uint32_t b = (((a >> 2) & 0x33333333) + (a & 0x33333333));
    uint32_t c = (((b >> 4) + b) & 0x0f0f0f0f);
    uint32_t d = c + (c >> 8);
    uint32_t e = d + (d >> 16);
    uint32_t f = e & 0x0000003f;
    return f;
}

uint32_t Log2( uint32_t x )
{
    uint32_t a = x | ( x >> 1 );
    uint32_t b = a | ( a >> 2 );
    uint32_t c = b | ( b >> 4 );
    uint32_t d = c | ( c >> 8 );
    uint32_t e = d | ( d >> 16 );
    uint32_t f = e >> 1;
    return PopCount( f );
}

// Reverses the bits of x. This was referenced but not defined in the
// original post; a standard bit-reversal is filled in here.
uint32_t Rev32( uint32_t x )
{
    x = ((x & 0x55555555) << 1) | ((x >> 1) & 0x55555555);
    x = ((x & 0x33333333) << 2) | ((x >> 2) & 0x33333333);
    x = ((x & 0x0f0f0f0f) << 4) | ((x >> 4) & 0x0f0f0f0f);
    x = ((x & 0x00ff00ff) << 8) | ((x >> 8) & 0x00ff00ff);
    return (x << 16) | (x >> 16);
}

// x contains real inputs
// y contains imaginary inputs
// count must be a power of 2
// sign must be 1.0f (forward transform) or -1.0f (inverse transform)
void FFT( float* x, float* y, int count, float sign )
{
    int exponent = (int)Log2( (uint32_t)count );

    // bit reversal stage
    // swap all elements with their bit reversed index within the
    // lowest level of the Cooley-Tukey recursion tree
    for ( int i = 1; i < count - 1; i++ )
    {
        uint32_t j = Rev32( (uint32_t)i );
        j >>= (32 - exponent);
        if ( (uint32_t)i < j )
        {
            float tx = x[ i ];
            float ty = y[ i ];
            x[ i ] = x[ j ];
            y[ i ] = y[ j ];
            x[ j ] = tx;
            y[ j ] = ty;
        }
    }

    // for each recursive iteration
    for ( int iter = 0, L = 1; iter < exponent; ++iter )
    {
        int Ls = L;
        L <<= 1;
        float u1 = 1.0f; // cos( 0 )
        float u2 = 0;    // sin( 0 )
        // Per-row twiddle rotation. wr/wi were undefined in the original
        // post; this fills them in (sign convention assumed).
        float wr =  cosf( 3.14159265f / (float)Ls );
        float wi = -sign * sinf( 3.14159265f / (float)Ls );

        // rows in DFT submatrix
        for ( int j = 0; j < Ls; ++j )
        {
            // do butterflies upon DFT row elements
            for ( int i = j; i < count; i += L )
            {
                int index = i + Ls;
                float t1 = u1 * x[ index ] - u2 * y[ index ];
                float t2 = u1 * y[ index ] + u2 * x[ index ];
                x[ index ] = x[ i ] - t1;
                y[ index ] = y[ i ] - t2;
                x[ i ] += t1;
                y[ i ] += t2;
            }

            // Rotate u1 and u2 via Givens rotations (2d planar rotation).
            // This keeps cos/sin calls in the outermost loop.
            // Floating point error is scaled proportionally to Ls.
            float z = u1 * wr - u2 * wi;
            u2 = u1 * wi + u2 * wr;
            u1 = z;
        }
    }

    // scale factor for forward transform
    if ( sign > 0 )
    {
        float inv_count = 1.0f / (float)count;
        for ( int i = 0; i < count; i++ )
        {
            x[ i ] *= inv_count;
            y[ i ] *= inv_count;
        }
    }
}
I am glad I set you guys off on a fun math journey. I figured out I was doing the question I was assigned completely wrong, which explains why it was so difficult. LOL
On June 17 2017 04:53 Amanebak wrote: Hey. I stumbled across a problem like this:
What is the longest gondola that can take a right-angle turn of a Venetian canal? The width of the canal is 2, respectively 3 length units before, resp. after the turn.
I apologize for my English.
Unless i missed something, the result is 5*sqrt(2).
The gondola needs to be able fit into the diagonal of the (2+3)*(2+3) square, which has a length of 5 sqrt (2)
At that point, 2sqrt(2) of the gondola is in the thinner canal, and 3sqrt(2) is in the thicker canal. It is obvious that no longer gondola could reach as far into the 3m canal if you draw a picture of the situation, and the 5sqrt(2) gondola can continue onwards from this point on.
Thank you for your reply.
If I understand it correctly, gondola turned 45 degrees and touches sides of the canal at both ends and touches the 'inner' edge of the turning with its side. That is the diagonal of the (2+3)*(2+3) square. Yes, it can go on, but I don't think it could go back, unless it bends. To imagine, I propose to make the difference of widths greater, like 1 before turn and 100 after.
Hm, i calculated it another way, and got a better result:
Just take a look at any rectangle that touches the corner of the canal with its diagonal. Before and after the turn, the sides all have lengths (2+x) and (3+y), where 2/y = x/3 (intercept theorem). Thus, we are looking at the diagonal of a rectangle with sides (2+x) and (3+6/x), and specifically for the minimum of those diagonals.
The diagonal has length
f(x) = sqrt((2+x)² + (3+6/x)²)
and thus the derivative
f'(x) = 1/(2f(x)) * (2*(2+x) - (12/x²)*(3+6/x)),
which you would now need to find the zeroes of. (I had Mathematica do it because I am lazy; the zero is apparently at x = 18^(1/3), and further checks will almost certainly prove it to be a minimum.) That means that the lowest possible length of the gondola is f(18^(1/3)), which you can simplify to sqrt(13 + 6*18^(1/3) + 3*18^(2/3)), which is about 7.02, and indeed slightly less than 5*sqrt(2). Sorry for that.
On June 17 2017 04:53 Amanebak wrote: Hey. I stumbled across a problem like this:
What is the longest gondola that can take a right-angle turn of a Venetian canal? The width of the canal is 2, respectively 3 length units before, resp. after the turn.
I apologize for my English.
Unless i missed something, the result is 5*sqrt(2).
The gondola needs to be able fit into the diagonal of the (2+3)*(2+3) square, which has a length of 5 sqrt (2)
At that point, 2sqrt(2) of the gondola is in the thinner canal, and 3sqrt(2) is in the thicker canal. It is obvious that no longer gondola could reach as far into the 3m canal if you draw a picture of the situation, and the 5sqrt(2) gondola can continue onwards from this point on.
Thank you for your reply.
If I understand it correctly, gondola turned 45 degrees and touches sides of the canal at both ends and touches the 'inner' edge of the turning with its side. That is the diagonal of the (2+3)*(2+3) square. Yes, it can go on, but I don't think it could go back, unless it bends. To imagine, I propose to make the difference of widths greater, like 1 before turn and 100 after.
Hm, i calculated it another way, and got a better result:
Just take a look at any rectangle which touch the corner of the canal. Before (and after) the turn, they all have the lengths (2+x) and (3+y), where 2/y = x/3 (intercept theorem). Thus, we are looking at the diagonal of a rectangle with sides (2+x) and (3+6/x), and especially for the minimum of those diagonals.
The diagonal has the length of
f(x) = Sqrt((2+x)²+(3+6/x)²) and thus the derivative of: f'(x) = 1/2f(x) * (2*(2+x)- (12/x²)*(3+6/x) Which you would now need to find the zeroes of. (I had mathematica do it because i am lazy, the zero is apparently at (18)^(1/3). (And further checks will almost certainly prove it to be a minimum That means that the lowest possible length of the gondola is f( (18)^(1/3)), which you can simplify to Sqrt (13 + 6*18^(1/3) + 3* 18^(2/3)), which is about 7.02, and indeed slightly less than 5Sqrt(2). Sorry for that.
Good job!
There is another way, without using the intercept theorem. You consider the segments that touch the edge with their side and find the minimal one, using \alpha, the angle between the latter direction of the canal and the segment. We want to minimize the function
l = l_1 + l_2 = 3/sin(\alpha) + 2/cos(\alpha)
I got something like \alpha_min = \arctan \sqrt[3]{3/2}, which is interesting to me. If we generalize this a bit, taking a, resp. b as parameters instead of 2, resp. 3, and think of the symmetries of this problem, we conclude...
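The two derivations can be cross-checked numerically. This sketch (mine, not from the thread) evaluates Simberto's f at x = 18^(1/3), Amanebak's l at \alpha = arctan((3/2)^(1/3)), and the closed form, and confirms they agree at about 7.023:

```python
import math

# Simberto's parametrisation: diagonal of the rectangle with sides (2+x), (3+6/x)
x = 18 ** (1 / 3)
f_min = math.sqrt((2 + x) ** 2 + (3 + 6 / x) ** 2)

# Amanebak's parametrisation: l(a) = 3/sin(a) + 2/cos(a)
a = math.atan((3 / 2) ** (1 / 3))
l_min = 3 / math.sin(a) + 2 / math.cos(a)

# the closed form quoted in the thread
closed = math.sqrt(13 + 6 * 18 ** (1 / 3) + 3 * 18 ** (2 / 3))

print(round(f_min, 3), round(l_min, 3), round(closed, 3))  # all about 7.023
```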
On June 12 2017 02:45 CecilSunkure wrote: So I've spent a day studying the Fourier transform. So far the math is very understandable, at least on how to apply it. Understanding why it works and specifically what it does is more challenging. Most resources on the subject are just absolutely awful, and I may have to go out and buy an old textbook to get good information.
It seems like FFT implementations popped up around the 80s-90s, and nobody ever really looked at them again in depth. Since then everyone takes an old implementation and wraps it. Well, there's FFTW, but that library is a software engineering nightmare: a horribly obfuscated and heavyweight solution for a particularly small function. But what can you expect from a bunch of academics.
You need to treat FFTW3 as a black box, because that is what it is supposed to be. It is used by Matlab and is free for academics to use, so it is an industry standard.
The real challenge is to be able to derive all the math and understand it in all its correctness. Only this way can I write a good implementation.
Well, this is just not true. You use it as a black box. There is no point understanding what is inside if you are just applying it. If you want to improve it or change it, then you need to know what is inside the box.
BTW, when I did some FFTW stuff and wanted to test correctness, I ran into difficulties because defining PI with 12 decimals, or even 30, wasn't accurate enough; rounding would cause funky stuff, with frequency bins that weren't supposed to fill up getting filled, because of oscillations in how each sin or cos call was rounded each period.
On June 17 2017 10:29 BrTarolg wrote: Man, i have a degree in maths with a focus on pure
I *still* have no idea what's going on most of the time, inc in this thread lmao
I get that experience all over again every 3 or 4 weeks. You talk to people or read some reference, and suddenly there exists a whole new field, with all its own intricacies, that you never knew about, barely understand, and think you want to know more about. But that is just impossible. So you figure out some silly pragmatic workaround, accepting that you are just one person with finite time and finite talent.
On June 17 2017 20:28 Kadungon wrote: Well, this is just not true. You implement it as a black box. There is no point understanding what is inside, as you are just implementing it. If you want to improve it or change it, then you need to know what is inside the box.
But that's what I did. Improve and change it. Obviously not mathematically improve or change, just in terms of software engineering benefits.
i teach elementary school and we do probability. things like dice and spinners. basic stuff. was talking to a friend about probability and he gave me this problem. i've been unable to make sense of it. any help is appreciated.
You and I are playing a game. We each shout out a number 1-100 every 5 seconds. Once someone says a number, that person can't say it again. What's the probability we match at least once?
On June 23 2017 05:42 esla_sol wrote: i teach elementary school and we do probability. things like dice and spinners. basic stuff. was talking to a friend about probability and he gave me this problem. i've been unable to make sense of it. any help is appreciated.
You and I are playing a game. We each shout out a number 1-100 every 5 seconds. Once someone says a number, that person can't say it again. What's the probability we match at least once?
The probability of matching at least once equals 1 minus the probability of never matching, so the answer to your question is:
1 - !100/100!, which is approximately 63 percent.
I can explain what a derangement is all about tomorrow if you're interested, but for now I need to go to bed
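The subfactorial !100 (the number of derangements of 100 items) can be computed with the standard recurrence !n = (n-1)(!(n-1) + !(n-2)). This sketch (my own, not from the thread) checks the ~63% figure exactly with rational arithmetic:

```python
from fractions import Fraction

def derangements(n):
    # !0 = 1, !1 = 0, then !k = (k-1) * (!(k-1) + !(k-2))
    d = [1, 0]
    for k in range(2, n + 1):
        d.append((k - 1) * (d[k - 1] + d[k - 2]))
    return d[n]

factorial = 1
for k in range(1, 101):
    factorial *= k

p_match = 1 - Fraction(derangements(100), factorial)
print(float(p_match))  # about 0.6321, i.e. roughly 1 - 1/e
```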
Had to take three stats courses to get my degree and I still have no idea what I'm doing when it comes to probability. Just keep guessing until I get the answer
On June 23 2017 04:15 travis wrote: Here's a math question I have.
i want to write a sum:
from i=1 to n of: 1/2*(i-1)i
But I don't actually want i to stop when i = n. I want i to stop when the sum of "i values" has reached n. How do I do that?
It's like if I could write my sum as "from i = i +1 to n"
Example:
If n = 10 I want my sum to go through i values of: 1, 2, 3, 4 - then STOP. Which gives me a sum of 0+1+3+6 = 10
Is the only way to do this to solve for the relationship between what I want i to do, and n? Because I am not sure what the relationship is.
You can write basically anything below the sum symbol as a condition, but usually these conditions describe the individual elements of the sum, not the sum as a whole.
I would write it as the minimum of { (your sum from 0 to n) | n ∈ N, (your sum) > n0 }, with n0 being the thing you want. That works as long as all the terms of the sum after this point are positive. If the sum has negative terms afterwards, it becomes more complicated, because the minimum could be later down the line.
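In code the stopping rule is easy to state. Here is a sketch of my reading of the question (function names are mine): keep adding terms (i-1)*i/2 while the running total of the i values themselves stays at most n. Since 1 + 2 + ... + m = m(m+1)/2, the last index used is m = floor((-1 + sqrt(1 + 8n)) / 2).

```python
import math

def capped_sum(n):
    total, used, i = 0, 0, 1
    while used + i <= n:       # stop once the i values themselves would exceed n
        used += i
        total += (i - 1) * i // 2
        i += 1
    return total

# closed form for the last index: 1 + 2 + ... + m <= n
def last_index(n):
    return int((-1 + math.sqrt(1 + 8 * n)) // 2)

print(capped_sum(10), last_index(10))  # 10 and 4, matching the n = 10 example
```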
The probablity of matching at least once equals 1 minus the probablity of never matching so the answer to your question is:
1-!100/100! which approximately equals 63 percent.
I can explain what dearangement is all about tomorrow if you're interested but for now I need to go to bed
that's amazing. i had to look up what the factorial in the front meant. is the reason you divide by 100! because you have 1/100 chance, then 1/99, and so on?
So I have a bit of a problem I've worked on in the past, but my math is too lacking to solve it. The problem is as follows: with a standard deck of cards (52 cards; 26 black, 26 red), what is the probability that one of the colors appears 5 or more times in a row when the deck is shuffled into a random sequence of 52 cards? I've written a small piece of code that can easily make over 1x10^6 runs, and I come to just under 80%. I know it's a combinatorial problem, but I have no idea how to solve it formally. Nor does the probability seem consistent, because it fluctuates depending on the code used (different people, different code; or my code sucks, but I've tested it thoroughly, so that shouldn't be it...)
On June 23 2017 11:21 Uldridge wrote: So I have a bit of a problem I've worked on in the past, but my math is too lacking to solve it. The problem is as follows: with a standard deck of cards (52; 26 black, 26 red), what is the probability that one of colors is found 5 times or more (so at least 5 times) in a row when the deck is shuffled and put in a sequence (so random sequence of 52 cards)? I've made a small piece of code that can easily make over 1x10^6 runs to come to just under 80% and I know it's a combinatorial problem, but I have no idea how to formally solve it. Nor does the probability seem consistent because it fluctuates depending on the type of code used (different people, different code, or either my code sucks, but I've tested it thoroughly, so that shouldn't be it...)
Isn't that just: b = number of black cards left when drawing the first black card, c = total number of cards left in the deck, and the probability is b/c * (b-1)/(c-1) * (b-2)/(c-2) * (b-3)/(c-3) * (b-4)/(c-4)? The -1/2/3/4 comes from a black card already having been drawn, so both the black count and the total count decrease by one with each draw.
EDIT: So basically, the chance to draw a black card is always (black cards left) / (total number of cards left), and the chance to draw multiple black cards in a row is (chance for card 1) * (chance for card 2) * ... * (chance for card n).
I think he's asking for the probability of a sequence of 5 black cards appearing anywhere in the stack. I have calculated this for sequences with replacement. Without replacement it's harder and I'm not sure you can use the same approach at all, but with replacement it works as follows:
The chance of any given 5-card window being all black is simply 1/2^5. With 52 cards, there are 52-5+1 = 48 different 5-card windows.
Treating those windows as independent (which they aren't quite, since they overlap, so this is an approximation), the chance of one or more of them being all black is 1 minus the chance of none of them being all black: 1 - (1 - 1/2^5)^48. So with replacement, you'd have roughly a 78% chance of getting a 5-card all-black sequence. Without replacement the problem seems much harder, but perhaps this gives you an idea of how to approach it?
Edit: hmm, without replacement you should just be able to enumerate all the ways of getting such a run, right? I am missing something here, but a first stab: there are 47 cards left after your run, so that's 47! possible orders, and 48 places to start the run, so 48 * 47! = 48! arrangements... and (26 choose 5) ways of actually making the run, so (26 choose 5) * 48!/52!. This seems way too small, so I probably have a mistake somewhere. But this approach seems promising.
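Evaluating that with-replacement estimate numerically (keeping in mind it treats overlapping windows as independent, so it is only an approximation):

```python
p_window = 1 / 2**5           # a fixed 5-card window being all black
windows = 52 - 5 + 1          # 48 overlapping windows
p_estimate = 1 - (1 - p_window) ** windows
print(p_estimate)             # just over 78%
```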
On June 23 2017 11:21 Uldridge wrote: So I have a bit of a problem I've worked on in the past, but my math is too lacking to solve it. The problem is as follows: with a standard deck of cards (52; 26 black, 26 red), what is the probability that one of colors is found 5 times or more (so at least 5 times) in a row when the deck is shuffled and put in a sequence (so random sequence of 52 cards)? I've made a small piece of code that can easily make over 1x10^6 runs to come to just under 80% and I know it's a combinatorial problem, but I have no idea how to formally solve it. Nor does the probability seem consistent because it fluctuates depending on the type of code used (different people, different code, or either my code sucks, but I've tested it thoroughly, so that shouldn't be it...)
I expect that a closed-form expression for this probability will be rather complicated. I suppose you can get a recursive formula which then can be evaluated by a computer.
Let A(b, r) be the probability that you will have a streak of 5 black cards in a deck with b black cards and r red cards. For example, if you start with a stack of 26 black cards (and no red cards), you obviously have probability 1 of getting a streak of 5 black cards, whereas in a stack with 26 red cards (and 0 black cards) the probability is 0.
Now suppose that you have a stack of b black cards and r red cards. If the first card is red (which happens with probability r/(b+r)), then a streak of 5 black cards must be contained in the remaining b black and (r-1) red cards, where it has probability A(b, r-1). The probability of getting a 5-black streak in this case (first card red) is therefore (r/(b+r)) * A(b, r-1). If the first card is black and the second card is red (which happens with probability (b/(b+r)) * (r/(b+r-1))), a 5-black streak must be contained in the remaining (b-1) black and (r-1) red cards. The probability of getting a 5-black streak in this case (first card black, second card red) is therefore (b/(b+r)) * (r/(b+r-1)) * A(b-1, r-1). Now you iterate this idea for (1st card black, 2nd card black, 3rd card red), ..., until (1st B, 2nd B, 3rd B, 4th B, 5th B), in which case you have a 5-black streak regardless of what happens with the remaining cards.
Altogether, you obtain a recursive formula for A(b, r) with initial values A(b, 0) = 1 for b >= 5 and A(0, r) = 0 for any r. This should be easy to code once you've found all the formulas, and a computer should be able to find A(26, 26), the value in question, in almost no time.
edit: I think my initial values are not sufficient for the recursion yet - you also need A(b, r) = 0 whenever b < 5, since a streak is then impossible. But if you understand the principle, adding the missing initial values should not be much of a problem.
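A sketch of that recursion in Python, with memoization, conditioning on where the first red card appears (note this computes the probability of a black streak specifically, as in the post above; counting a streak of either color would need a slightly richer state):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def A(b, r):
    """Probability of a run of at least 5 black cards somewhere in a
    uniformly random arrangement of b black and r red cards."""
    if b < 5:
        return 0.0            # not enough black cards for any streak
    if r == 0:
        return 1.0            # all-black stack: streak guaranteed
    total = b + r
    prob = 0.0
    prefix = 1.0              # P(first j cards drawn are all black)
    for j in range(5):
        # first j cards black, card j+1 red; the streak must then
        # occur among the remaining (b - j) black and (r - 1) red cards
        prob += prefix * (r / (total - j)) * A(b - j, r - 1)
        prefix *= (b - j) / (total - j)
    prob += prefix            # first 5 cards all black: streak found
    return prob

print(A(26, 26))
```

As a sanity check, A(5, 1) should be 1/3: of the 6 equally likely positions for the single red card, only the first and last leave the 5 black cards together.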
On June 23 2017 04:15 travis wrote: Here's a math question I have.
i want to write a sum:
from i=1 to n of: 1/2*(i-1)i
But I don't actually want i to stop when i = n. I want i to stop when the sum of "i values" has reached n. How do I do that?
It's like if I could write my sum as "from i = i +1 to n"
Example:
If n = 10 I want my sum to go through i values of: 1, 2, 3, 4 - then STOP. Which gives me a sum of 0+1+3+6 = 10
Your series can be written as
sum from i=1 to k of (i-1)i/2,
where k is defined by your requirement that the sum of the i values has to equal n:
sum from i=1 to k of i = n. (1)
All that we have to do now is solve equation (1) for k in terms of n. The finite sum on the left side is one we can look up in a table: it equals k(k+1)/2, so equation (1) reads k(k+1)/2 = n.
What we have here is a quadratic equation for k: k^2 + k - 2n = 0.
Use the quadratic formula to solve it: k = (-1 ± sqrt(1 + 8n))/2.
We use the root with the plus sign since we want k to be a positive value. The other root is negative, so we discard it. Therefore, your sum can be written in terms of n by taking the upper limit to be k = (sqrt(8n+1) - 1)/2.
This sum can be evaluated. Keep the upper limit written as k for a moment and split up the sum like so: sum of (i-1)i/2 = (1/2)(sum of i^2 - sum of i).
The finite sum of i^2 and the finite sum of i can be looked up in a table and/or proved using mathematical induction: they are k(k+1)(2k+1)/6 and k(k+1)/2. Simplifying this gives us (k^3 - k)/6.
Now we can plug in that value for k. Doing so and simplifying gives us the final result: n(sqrt(8n+1) - 3)/6.
If we plug in n = 10, then the upper limit evaluates to 4 and the sum evaluates to 10, which is consistent with your example.
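If the derivation above is right, the closed form works out to n(sqrt(8n+1) - 3)/6. A quick sketch comparing it against the direct "stop when the i values reach n" sum, for the triangular values of n where the stopping condition lands exactly:

```python
from math import isqrt

def closed_form(n):
    # n * (sqrt(8n + 1) - 3) / 6; exact when 8n + 1 is a perfect square
    return n * (isqrt(8 * n + 1) - 3) // 6

def direct(n):
    # accumulate (i-1)i/2 for i = 1, 2, ... until the i values sum to n
    total = running = i = 0
    while running < n:
        i += 1
        running += i
        total += (i - 1) * i // 2
    return total

for k in range(1, 20):
    n = k * (k + 1) // 2      # triangular n, where the sum stops exactly
    assert closed_form(n) == direct(n)
print("closed form matches the direct sum")
```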
Is the only way to do this to solve for the relationship between what I want i to do, and n?
I'm reluctant to say it's the only way, but it is the most convenient way as far as I can see.
Because I am not sure what the relationship is.
Yeah, you are. You say yourself that you want the sum to stop when the sum of i values equals n. We write equation (1) from that fact.
The probability of matching at least once equals 1 minus the probability of never matching, so the answer to your question is:
1 - !100/100!, which is approximately 63 percent.
I can explain what derangements are all about tomorrow if you're interested, but for now I need to go to bed
that's amazing. i had to look up what the factorial in the front meant. is the reason you divide by 100! because you have 1/100 chance, then 1/99, and so on?
On the bottom you have 100!, which is the total number of ways to order the 100 numbers. On the top you have !100, which is the number of different ways to order the 100 numbers such that none of the numbers are shouted at the same time. I have to admit I've never even heard of derangements or subfactorials before today.
On June 23 2017 05:42 esla_sol wrote: i teach elementary school and we do probability. things like dice and spinners. basic stuff. was talking to a friend about probability and he gave me this problem. i've been unable to make sense of it. any help is appreciated.
You and I are playing a game. We each shout out a number 1-100 every 5 seconds. Once someone says a number, that person can't say it again. What's the probability we match at least once?
If neither person was allowed to repeat the other's numbers the problem would be considerably easier (just break it down into a big conditional probability chain). I'm stumped.
!100 (subfactorial 100) is the number of ways you can derange 100 objects.
!1 = 0, because there is no way to derange one object
!2 = 1, because there is one way to derange two objects (BA)
!3 = 2 (BCA, CAB)
!4 = 9 (BADC, BCDA, BDAC, CADB, CDAB, CDBA, DABC, DCAB, DCBA)
!5 = 44
!6 = 265
firstly: wow. secondly, sorry HKTPZ, but what you posted actually wasn't what I wanted - I don't know how I got confused. This is the solution I was looking for, since I needed things in terms of n instead of in terms of the iteration I want to stop at.
but now I am shocked, because while this answer certainly seems correct, this seems way more advanced than anything I should be having to do for this class, lol
maybe not though... since I do actually understand it
I really don't understand what you are doing in this step:
-what am I doing wrong? why is this image not loading?
edit2: oh wait, I see what you are doing now! I misunderstood the picture, you are saying the entire sum = n. ok that makes sense
It's the final question there (2b). It works like the first one, except now the sum goes from i=1 to k of (i-1)i/2,
but this sum will be repeated Z times, and after every Z repetitions of the inner sum, Z is raised to the next power of 2 (if it was 1 it becomes 2, if it was 2 it becomes 4, if it was 4 it becomes 8) and k is incremented by 1.
But then finally all this needs to be related to n, where n is equal to the total number of times i was incremented over all repetitions.
aaand this was probably all insanely confusing so I will write it like this
If our n = 17
Then we need to do the sum from i=1 to 1 of (i-1)i/2 ONE TIME
Then we need to do the sum from i=1 to 2 of (i-1)i/2 TWO TIMES
Then we need to do the sum from i=1 to 3 of (i-1)i/2 FOUR TIMES
If our n was 49
Then we need to do the sum from i=1 to 1 of (i-1)i/2 ONE TIME
Then we need to do the sum from i=1 to 2 of (i-1)i/2 TWO TIMES
Then we need to do the sum from i=1 to 3 of (i-1)i/2 FOUR TIMES
Then we need to do the sum from i=1 to 4 of (i-1)i/2 EIGHT TIMES
and we don't worry about n-values that don't line up nicely
I could definitely use help.. I am about 75% sure I've got this much correct. I'll be working on it.. lol I'll try using what shalashaska showed. But first... lunch
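One way to pin the description above down is to simulate the loop structure directly and count; a sketch (here `max_k` is just a name I chose for the outermost level reached):

```python
def simulate(max_k):
    """Run the nested sums as described: for k = 1, 2, ..., max_k,
    repeat the inner sum of (i-1)i/2 for i = 1..k a number of times
    that doubles at each level (1, 2, 4, 8, ...).

    Returns (n, total): n counts every increment of i across all
    repetitions; total is the accumulated value of the sums."""
    n = total = 0
    repetitions = 1
    for k in range(1, max_k + 1):
        for _ in range(repetitions):
            for i in range(1, k + 1):
                total += (i - 1) * i // 2
                n += 1
        repetitions *= 2
    return n, total

print(simulate(3))   # the n = 17 example: (17, 18)
print(simulate(4))   # the n = 49 example: (49, 98)
```

Having the exact (n, total) pairs for small cases makes it much easier to guess and then prove the closed form relating total to n.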
Apologies if this has already been answered somewhere, but I wanted to ask this.
It's been almost 20 years since I did any serious kind of mathematics, and I don't have much exposure to mathematics in my current role as an application programmer.
Can any of you recommend some resources to start learning calculus and probability from scratch? I am looking for books that talk more about the theory rather than working out exercises. A few books I tried in the past turned out to be extremely dry affairs that jumped straight into problem solving without much background text.
For the calculus portion, do you mean its theoretical foundations? If that is the case, you are looking for an introductory text on real analysis; you can go on math.stackexchange to find all sorts of textbook recommendations. The typical answer for calculus theory / real analysis has always been Rudin's real analysis textbook, though there are friendlier alternatives out there.
On June 24 2017 03:54 ninazerg wrote: Does 1 + 1 ever not equal 2?
Usually, one defines 2 to be 1+1. In this sense, 1+1=2 always holds.
But you can have 1+1=0, for example, in which case 2=0 (so still 1+1=2). In this case, if there is already some object taking the role of "2", one typically writes 0 instead of 2, to have an unambiguous notation.
Indeed. In the realm of natural numbers (or the reals, or the rationals, or a lot of other realms), 1 + 1 = 2.
However, it is quite easy to conceive of a mathematical object worth studying where that is not the case. Take, for example, F2, the field of numbers modulo 2. In that field there are only two objects: everything congruent to 0 modulo 2, and everything congruent to 1 modulo 2. (One could of course also give these two objects different names.)
It is quite obvious that 1 + 1 in that field equals 0, because 2 is congruent to 0 modulo 2.
Merely a joke about rounding numbers... typically, 1.3 would be rounded down to 1 while 2.6 would be rounded up to 3, so while 1.3 + 1.3 = 2.6, individually rounding those three numbers would create the incorrect estimate that 1 + 1 = 3.
Not so much a joke, just showing you're a physicist :p Coming from math, I had a real hard time at first understanding the shenanigans behind 'pi = 1' or '1000+200 = 1000' and such.
I feel just the opposite. I get annoyed when people are working with rough estimates but use a value of pi that is correct to 5 decimal places. I know that's what some textbooks tell you to do, but come on people, use common sense.
So, this is kind of a crossover between CS and math. Not sure which thread to post this in. I guess math, but hopefully some people here understand it.
I have a boolean formula: (A ^ B) ^ (C V D) V (B ^ D) ^ (~C V ~D), etc. We are looking for the asymptotic complexity of finding satisfiability (a combination of trues and falses that makes the final output true) through brute force. The question is basically how many combinations of inputs there are. I believe the answer is 2^n.
The 2nd part of the question says that we now have the ability to pass our algorithm an assignment (A = true, B = false, C = true, etc. - one of our 2^n combinations) - but we can ALSO pass an integer "k". Our algorithm will return whether we have a satisfying assignment, and will ALSO return whether the boolean formula can be satisfied with "k" or fewer TRUE values.
The question for the 2nd part is what the asymptotic complexity is to find a satisfying assignment. Here is what I believe the answer is.
First, find the exact "k" value. This can be done with a binary search: pass "k = n"; if it comes back true, cut "k" in half and pass it again; if that comes back false, move to the midpoint between that k and the last k that returned true; if it's true, cut it in half again, etc. Keep halving the interval until we pin down the exact k. This is log_2(n) operations.
Once we have the exact k, we know we need to pass exactly that many trues into our algorithm. Therefore the total number of combinations to try will be (n choose k).
So I think the complexity this time will be log_2(n) + (n choose k), which asymptotically is just (n choose k) - which in the average case I think is much better than the first scenario.
Does this math look good? I just wanted to see if anyone spots any obvious mistakes.
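The binary-search step can be sketched like this (the oracle here is a hypothetical stand-in for the "satisfiable with at most k trues?" query, since we only know its interface, not its internals):

```python
def min_true_count(oracle, n):
    """Binary search for the smallest k in [0, n] with oracle(k) True.

    oracle(k) answers "is the formula satisfiable using at most k true
    inputs?" and is assumed monotone: once it is True for some k, it
    stays True for every larger k. Uses O(log n) oracle calls."""
    if not oracle(n):
        return None            # formula is not satisfiable at all
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(mid):
            hi = mid           # k = mid suffices; look lower
        else:
            lo = mid + 1       # need more than mid true values
    return lo

# hypothetical stand-in: a formula that needs at least 10 true inputs
print(min_true_count(lambda k: k >= 10, 100))   # 10
```

This only finds k; as the post says, actually finding a satisfying assignment still means searching over the (n choose k) ways to place the true values.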
I still agree. I mean pi at 5 decimals is as bad as taking 1 or 10. You can't replace pi by anything, just keep it. This has been the topic of many discussions at the (physics) lab :p
It depends on *why* your error is what it is. If you have a 10% error because that's the limit of your instrument and there's no reasonable way to improve it, then sure use the best value of pi you can. But if you have a 10% error because you are just looking for a ballpark figure then what's the point? You might as well put that extra effort into improving your measurement or your model or whatever.
There's a difference between measurements and ballpark estimates. Both have errors, but when you estimate you create errors in order to keep moving along and get to some reasonable number as fast as possible. Saying that pi ~ 3.14 is no different than saying 900 + 95 ~= 1000
Soon we are going to need the Algorithms Thread.
I'm not sure I understand part 2, do you mean you can choose k out of n inputs arbitrarily? Or the first k of n?
In part 2 you can pass an integer "k" in addition to your series of true/false values.
It then tells you if the formula is satisfiable with "k" or fewer true values. So if your formula needs 10 true values and you pass it a "k" value of 9, it will return false.
It's not tied to the actual series of boolean variables you are sending, other than that they are both related to solving the formula. They could be 2 different functions for all it matters.
Are there any restrictions on the formula? The example is (A op B) op (...) ..., so is it always the same pattern of brackets? Not sure about this off the top of my head. The naive solution is to check everything at O(2^n), but if you analyse the formula itself, shouldn't it be something like O(n), because you just count the operations in a certain way? E.g., (n-k) equals the number of ANDs or some such.
Edit: The problem I have with your original solution is that you are looking at the complexity as the number of calls to f(n, k) in a binary search on k. But I am wondering if the question is to figure out the complexity of f itself - how it determines what to return for a given k.
in all my years in academic mathematics, this is the best math problem i have ever encountered. so simple, yet so frustratingly elusive. and the answer is elegant and massively satisfying.
problem: prove that 1+1=2 without using addition and numbers in your process and solution.
anybody want to have a go?
Peano arithmetic defines numbers as literals in a first-order logic. Does that count as "using numbers"? Otherwise you could do it in set theory, but it's basically the same.
Alright travis. I am going to split my answer into two parts. I think if you understand the logic behind Gauss sum, you can very easily apply the same logic to your original problem and arrive at the formulas you desire.
We will only consider even n in the following, as the Gauss sum for odd n follows directly.
1) Gauss sum. The idea behind the Gauss sum is to make pairs: you take the first and the last element (1 and n), the second and the second-to-last element (2 and n-1), and so on. Notice that all these pairs have sum n+1. Now you have the n elements split up into n/2 pairs, each of sum n+1, whence the whole sum is (n+1)*n/2 - the Gauss sum.
2) partial sums. I advise that if you understand 1), you try to apply the same logic yourself before simply reading the solution.
Now instead of summing 1+2+...+n, you just have the sum 1+2+...+n/2. This means our pairs are (1 + n/2), (2 + n/2 - 1), and so on, each with sum n/2 + 1. There are (n/2)/2 pairs, as you have n/2 elements. This leaves us with (n/4)*(1 + n/2) = (1/8)*n*(n+2).
Also I wanted to share in this thread one of my favorite proofs. The problem is very simple and so is the proof, once you see it, though it takes a lot of creativity to solve it yourself. I dare you to try before reading the solution
"For all prime numbers n with n greater than or equal to 5, n^2 - 1 is divisible by 24."
First we note that (n+1)*(n-1) = (n^2)-1. (*) Since n is prime, n is odd, so both n-1 and n+1 are even. As every second even number (4, 8, 12, ...) is divisible by 4, either n+1 or n-1 is divisible by 4 and the other by 2; hence 8 divides (*). Now regard n-1, n, n+1 as three neighbours: of any three consecutive natural numbers, exactly one is divisible by 3. It is not n, since n is prime and n >= 5, so it is n-1 or n+1. Therefore (*) is also divisible by 3, and hence by 3*8 = 24.
I really like it because you can explain it to friends or family who don't know maths, to show them what a proof is (or you can be that guy at a party).
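The claim is also easy to brute-force check, which can be fun to show alongside the proof (using a naive trial-division primality test, just for illustration):

```python
def is_prime(n):
    """Naive trial-division primality test."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

for n in range(5, 10_000):
    if is_prime(n):
        assert (n * n - 1) % 24 == 0
print("verified for all primes below 10000")
```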
By extension of your proof, it's applicable to all odd numbers that are indivisible by 3, not necessarily prime.
On June 28 2017 04:43 travis wrote: Could someone help me with summation algebra
I keep getting problems where I am summing to some portion of n, say
the sum from i=1 to (n/2) of i
now, I know that the sum from i=1 to n of i is the Gauss sum: n(n+1)/2
I know this because I have memorized it
but when we sum to n/2 it becomes n(n+2)/8
I know this because wolfram alpha tells me so
but how can I manually solve this sum? Is there some rule for what happens to the sum when I only sum to a portion of my "n" ?
It really isn't complicated. The formula for the finite sum, as you know, is: the sum from i=1 to n of i = n(n+1)/2.
If the upper limit of the sum changes to n/2, then wherever we see an n in the formula, we replace it with n/2: (n/2)(n/2 + 1)/2. Factor out a 1/2 from (n/2 + 1) and you get Mathematica's answer, n(n+2)/8.
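Checking that substitution numerically for even n (where n/2 is an integer):

```python
def gauss(m):
    # 1 + 2 + ... + m = m(m + 1)/2
    return m * (m + 1) // 2

# for even n, substituting n/2 into the Gauss formula gives n(n+2)/8
for n in range(2, 101, 2):
    assert gauss(n // 2) == n * (n + 2) // 8
print("n(n+2)/8 checks out for all even n up to 100")
```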
That's why math and physics are awesome, so I thought I'd leave this here! YT video: Math vs Physics - Numberphile. Check it out if you have 13 minutes and 52 seconds to spare.
I tried my luck googling and going through YouTube videos but I couldn't find anything, so here goes:
When you are solving a separable differential equation by separating the variables, how is splitting up dy/dx justified (apart from the fact that the technique is without a doubt producing the right answer), and what is actually happening? Do you multiply each side of the equation by an infinitely small change in x?
Ah yes, this question touches on higher level calculus and my uni professor's answer to it in a calculus class was more or less "don't worry about it". If you want a fairly abstract and readable discussion on it, I'd recommend this stackexchange answer, or, if you prefer, this more technical blogpost.
It could also be fun to note that in algebraic geometry you introduce Kähler differentials on curves and surfaces basically by saying "let's consider the vector space of things of the form dx" and just treating it formally, and it ends up behaving much like the differential-geometric notions you're discussing.
There is a big difference between the answer when studying maths and when studying physics. If you are studying physics, the answer is: because it works.
If you are studying maths, you have to get into what total differentials mean.
I'd guess your first step is to learn about limits from a formal perspective, and then to learn about different kinds of spaces, representations of them, and how you end up doing limits in them.
Actually, mathematically this is not permitted, as it is not formally correct. Oftentimes you would rather use substitutions that are permitted. Or you can (and this is common practice) informally calculate a solution using that trick and afterwards show that it solves the diff eq. This, however, is no better than saying: oh look, this function magically appeared, and now I can prove it solves my issue.
On August 05 2017 01:52 Kleinmuuhg wrote: Actually, mathematically this is not permitted, as it is not formally correct. Oftentimes you would rather use substitutions that are permitted. Or you can (and this is common practice) informally calculate a solution using that trick and afterwards show that it solves the diff eq. This, however, is no better than saying: oh look, this function magically appeared, and now I can prove it solves my issue.
Which for differential equations is usually how it works anyway. You have theorems proving that if a solution with the necessary number of degrees of freedom exists, it is the only one. So if you have one, no matter where you got it from, that is all the information you need.
And I think if you interpret the "dx" and "dy" as total differentials, you can actually make that approach work.
Even if interpreted as total differentials, there is no general concept of simplifying differentials, I think. I'm not quite sure on that though, since "total differential" is not something I usually meet in the kind of functional analysis I'm concerned with. I seem to remember an exercise from a couple of years ago that gave an example in which simplifying dx/dy * dy/dz * dz/dx to 1 in an equation would yield a wrong solution. I'll see if I can dig it up.
From my first couple of classes on ordinary differential equations I seem to remember that the proofs behind the proper mathematical way of solving these problems involved integrating an inverse function and applying it to the original problem, which relies heavily on the involved functions being smooth homeomorphisms (at least locally) and on the ranges/domains getting along. These two things are often not checked at all by the engineers/physicists I've been working with.
Edit: Found it! By an application of the implicit function theorem, you can show that for the function F with open domain Omega and positive constants a, b, R
(Here it is assumed that the function has a zero in D, I think.)
Interpreting p as pressure, v as molar volume and T as absolute temperature, the equation F(p,v,T) = 0 (or p = RT/(v-b) - a/v^2) should be the Van der Waals equation, which approximates the behaviour of real gases, iirc.
So yeah, in this case the differentials cannot be simplified, as that would yield 1 instead of -1 in the above equation.
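For reference, the identity being alluded to here is, if I recall it correctly, the triple product rule: for an implicit relation F(p,v,T) = 0 with suitable non-vanishing partial derivatives,

```latex
\left(\frac{\partial p}{\partial v}\right)_{T}
\left(\frac{\partial v}{\partial T}\right)_{p}
\left(\frac{\partial T}{\partial p}\right)_{v} = -1
```

which is why naively "cancelling" the differentials to get 1 gives the wrong sign.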
TL are my bros, so y'all get this secret: the best math resource I have EVER found is this blog: https://betterexplained.com/cheatsheet/ (I have a Masters in Chemistry with Math and moonlight as a high school Math teacher, and this shit is so legit it helps my students and helps me like crazy).
If we want science to go really fast and get to the future sooner, we need to massively speed up our math education (especially in the West). We have the education knowledge of 'how to teach well'; we just need to apply it to our math classes. This BetterExplained blog does that brilliantly, and I strongly suggest everyone check it out.
Back in school, doing a linear algebra review assignment.
It has me plot a couple vectors in R2 and then the vectors under a transformation.
The transformation matrix is [0 1; 1 0] --->(the ; denotes the next line of the matrix)
So I can see that what is happening to my vectors is that the x and y coordinates are swapped.
The assignment wants me to "Describe geometrically what T does to each vector x in R2". Is my answer that the coordinates are swapped an acceptable answer? Is there some other way to describe what this transformation matrix is doing? Thank you.
Swapping coordinates doesn't sound very geometric.
I assume the answer that is expected is "mirrored across the straight line through (0,0) and (1,1) (and beyond)". (Don't know what it is called in English; it has a special name in German, something like the "first median".)
Obviously any other straight line at a 45° angle between the coordinate axes would also be fine if you are only talking about vectors. If you are talking about points, you need exactly that one.
A geometric description is something like a rotation/reflection/translation. They are almost certainly looking for the word "reflection" and what the axis of reflection is.
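A quick way to convince yourself (my own snippet, assuming NumPy is available): apply the matrix and see which points move and which stay fixed.

```python
import numpy as np

# T = [[0, 1], [1, 0]] swaps coordinates, i.e. reflects across the line y = x.
T = np.array([[0, 1],
              [1, 0]])

v = np.array([3, 1])
print(T @ v)                  # [1 3] -- the point (3,1) maps to (1,3)

# Points on the line y = x are fixed, as expected for the axis of reflection:
print(T @ np.array([2, 2]))   # [2 2]
```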
What do you mean by "or whatever axis labels you are using"? Using x and y is universal for the horizontal and vertical axes in R^2 afaik. Although it is weird that they would also use 'x' for a generic vector.
For Simberto; Spiegelung = reflection here. We don't have a special word for the x=y line.
On September 05 2017 01:55 travis wrote: In my book they tend to label the axes x_1, x_2... x_n
That generalises to arbitrarily many dimensions, but using x_1, x_2 is unnecessary for R^2. Tbh, I don't think I often see the axes labelled for R^n; it was always referred to as the i-th coordinate or i-th axis. That keeps x = (x_1, ..., x_n) free to use as an arbitrary point in R^n.
Then the way to refer to a line in the space is with a scaling variable. You can always describe a line by giving one point p on the line and a direction vector v, making the line r v + p for all real r. So the line y = x would be r (1,1).
Anyhoo, this is all about notation rather than anything useful for the question, and may have just confused matters more.
Notation is boring, and usually people know what you mean regardless. The people who worry about notation are usually those who are very uncertain about the subject matter at hand. People who know their shit usually just swap notation on the fly to whatever fits better for the current situation.
On September 05 2017 04:12 Simberto wrote: Notation is boring, and usually people know what you mean regardless. The people who worry about notation are usually those who are very uncertain about the subject matter at hand. People who know their shit usually just swap notation on the fly to whatever fits better for the current situation.
I had a course with a lecturer who wrote all functions on the right and all group actions as right actions. It got very messy and confusing. Even if you knew what was going on and what you wanted to say it could be very difficult to write it properly.
Notation is just another form of language and - just like language - some people are very fussy and pedantic and others are not. If you are too careless with notation/language then it can make things harder to understand and explain. If you are too pedantic then you can drown in details. Most people are somewhere in the middle.
So, I've been thinking a bunch about array signal processing, or about a specific problem, and the closest math I can find to it is array signal processing.
The problem is thus: Assume you have a regular grid of points p_m. Each point has a value v_m. You have N sensors s_n at known positions. You can measure each sensor s_i, and the result is the sum over all points j of v_j / dist(p_j, s_i)^2.
Given the measurements at each sensor, how well can you guess the values at each point? I think the main difference between this and traditional array processing is that we assume M >> N, and we aren't concerned with the positions of the points, as we assume them to be static and known.
Does anybody have any recommendations of stuff I could look up to learn more about this? Any terms that relate to this subproblem specifically?
Also, I've got a couple of basic questions that might be answerable by anyone who knows any array signal processing at all. Of course, if you make your sensors more spread out, you can get a better approximation. And if you get more sensors over the same area, you get a better approximation. But can anyone give me a rough idea of how much each of these helps? Like, does doubling the spread halve the error on far points, and does squaring the number of sensors halve the error on far points?
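Not an expert here, but the measurement model described is linear in the unknown values, so one standard framing is an N x M matrix A with A[i][j] = 1 / dist(s_i, p_j)^2, and reconstruction becomes a (regularized) least-squares inversion. A rough sketch of that framing — all names and the random geometry are my own assumptions, not from the post:

```python
import numpy as np

# measurement_i = sum_j v_j / dist(s_i, p_j)^2, which is linear in v.
rng = np.random.default_rng(0)

points  = rng.uniform(0, 10, size=(50, 2))   # M grid points (M >> N)
sensors = rng.uniform(0, 10, size=(5, 2))    # N sensors
v_true  = rng.uniform(0, 1, size=50)         # hidden point values

# A[i, j] = 1 / squared distance from sensor i to point j
d2 = ((sensors[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
A = 1.0 / np.maximum(d2, 1e-12)              # guard against coincident points

y = A @ v_true                               # what the sensors would report

# With M >> N this is badly underdetermined; lstsq returns the minimum-norm
# solution, so expect only a rough reconstruction of v_true.
v_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print("residual:", np.linalg.norm(A @ v_hat - y))
```

The interesting question (how error scales with spread and sensor count) then becomes a question about the conditioning of A, which is where I'd start digging.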
You have one die. If you roll a 1, you win the prize. If you roll anything else, you don't win the prize but you are free to roll again. If your next roll is a number higher than your last, or is a 1, you win. Otherwise, you roll again. You continue rolling until you win.
I'm trying to figure out how to calculate with high precision (without simulating) the expected number of rolls it takes on average to win the prize. I know there are methods like Markov chains that allow you to calculate the chance of having won after x rolls, but I'm looking for the average number of rolls to win. Any help would be appreciated, thank you.
On September 13 2017 06:09 enigmaticcam wrote: Question on probabilities:
You have one die. If you roll a 1, you win the prize. If you roll anything else, you don't win the prize but you are free to roll again. If your next roll is a number higher than your last, or is a 1, you win. Otherwise, you roll again. You continue rolling until you win.
I'm trying to figure out how to calculate with high precision (without simulating) the expected number of rolls it takes on average to win the prize. I know there are ways like markov chains that allow you calculate the chances of having won after x rolls, but I'm looking for the number of rolls on average to win. Any help would be appreciated, thank you.
This definitely isn't the right way to go about it, but: 1/6 chance to win immediately. If not, then 15 out of 30 of your second rolls will result in a win. So half of your second rolls will win it for you, and if you add your 1/6 to that, you have a higher than 50% chance to have won by your second roll, so on average you win by roll 2.
On September 13 2017 06:27 Dark_Chill wrote: This definitely isn't the right way to go about it, but: 1/6 chance to win immediately. If not, then 15 out of 30 of your second rolls will result in a win. So half of your second rolls will win it for you, and if you add your 1/6 to that, you have a higher than 50% chance to have won by your second roll, so on average you win by roll 2.
I need it with high precision though, say three places past the decimal point. I can easily simulate it: after 10 million games it takes on average 2.441 rolls to win. Just trying to figure out how to calculate that.
I've tried blowing it out to all possibilities and averaging the number of turns it took to win for each possibility, but I'm always stumped by how to count those where you haven't yet won. They can't be 0, otherwise that would be as if you had won without even playing, and they can't be infinity, as that would just be silly. I'm sure there's a better way to do it.
Consider the expected number of additional rolls required after you roll a 2; call it E[2]. We know that it'll take at least one more roll, and then there are two situations, based on whether you roll a 1 or 3-6 (you win) or a 2 (you're back where you started).
E[2] = 1 + 0*(5/6) + E[2]*(1/6), which gives E[2] = 1.2
So once a 2 is rolled, it'll take an expected 1.2 more rolls to win the game. Similarly for 3, but now we know the expected number of rolls we would need if we rolled a 2 next:
or, in Python terms: sum([1+1.2**i for i in range(1,6)] + [1])/6
This should be an exact answer.
(There's a bunch of stuff regarding Bernoulli distributions and things that I think you could set up if you wanted to keep it all purely symbolic, but I'm 98% confident that this simple approach is completely valid, and that I'm not cheating anywhere. I'd love to be informed if that's not the case!)
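Spelling out the whole recursion in that spirit (my own sketch of the steps described above, not from the original post):

```python
# E[k] = expected additional rolls after having just rolled k (k = 2..6):
# you always roll once more; a 1 or anything > k wins outright, while a
# roll j with 2 <= j <= k puts you in state j.
E = {}
for k in range(2, 7):
    # E[k] = 1 + (1/6) * sum(E[j] for j in 2..k)  ->  solve for E[k]
    E[k] = (1 + sum(E[j] for j in range(2, k)) / 6) * 6 / 5

# First roll: a 1 wins outright (0 further rolls), otherwise we land in state k.
expected = 1 + sum(E[k] for k in range(2, 7)) / 6
print(expected)   # ~2.48832, matching the one-liner above
```

It turns out E[k] = 1.2^(k-1), which is why the `1.2**i` one-liner works.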
The answer is converging, so set inf to be a really large number to calculate and you'll get more or less the exact answer; with only 100, my error is on the scale of 1E-69.
DO CHECK: SUM(W%Real(1) : W%Real(inf)) = 1, good.
W%weighted(1, 2, ..., n) = (1*W%Real, 2*W%Real, ..., inf*W%Real); AVERAGE = SUM(W%weighted(1) : W%weighted(inf)) = 2.48832 (truncating after the first 100 rolls)
Well, I spent a good 3 hours doing this, not very rigorous at all, but I'm 99.99% sure the answer is right. I'm curious to see the mathematically beautiful way to get this answer, if there is one.
On September 13 2017 14:48 Acrofales wrote: @fiwi: a mathematically elegant way of doing it is in the answer above yours.
Oh, I'm just too dumb to understand it, I see.
edit: Oh nevermind, I get it. I read the problem and I got too excited about trying to solve it rather than properly reading the comments that followed.
A right answer is a right answer. Shows the difference between how an engineer vs. a mathematician approaches a problem.
On September 13 2017 08:47 raNazUra wrote:Consider the expected number of rolls required after you roll a 2, call it E[2]. We know that it'll take at least one more roll and then there will be two situations, based on whether you roll a 1,3-6 or a 2.
E[2] = 1 + 0*(5/6) + E[2]*(1/6), which gives E[2] = 1.2
How did you figure out what E[2] is in this equation? I'm not sure how you got from "E[2] = 1 + 0*5/6 + E[2]*1/6" to "E[2] = 1.2".
E[2] is, at the end of the day, just some number. That means I can treat it like any other variable and do standard algebra to solve for it:
E[2] = 1 + 0*5/6 + E[2]*1/6
E[2] = 1 + E[2]/6 #clean up
E[2]*5/6 = 1 #subtract E[2]/6 from each side
E[2] = 1.2 #multiply by 6/5
It probably would have been clearer had I not switched from fractions to decimals halfway through.
On September 13 2017 14:48 Acrofales wrote: @fiwi: a mathematically elegant way of doing it is in the answer above yours.
Oh, I'm just too dumb to understand it, I see.
edit: Oh nevermind, I get it. I read the problem and I got too excited about trying to solve it rather than properly reading the comments that followed.
A right answer is a right answer. Shows the difference between how an engineer vs. a mathematician approaches a problem.
i.e. Python vs Excel :p
Matlab if I have an idea about the method of the solution, Excel when I have no idea how to start, Mathematica if I want to be a smart ass instead of being practical ^^.
But that was all about being clever and finding how to start it, not about the software.
I've been messing around with differentiation of exponential functions the last few days and there's something that I can't seem to figure out, so I thought I'd ask TeamLiquid for advice/help/inspiration.
By the definition of the derivative, the derivative of a^x is a^x times the limit of (a^h - 1)/h as h goes to 0. Evaluating that limit is trivial by applying L'Hopital, BUT that rests upon already knowing the derivative of a^x, so assuming that isn't known, how does one figure out that limit?
The brute-force approach obviously is just convincing oneself that the derivative of 2^x is 2^x itself times a constant which appears to be approximately 0.7.
Then one could do the same for 8^x and come to the realization that the derivative of 8^x is 8^x times a constant about 3 times as great as 0.7, which suggests we're looking at something logarithmic, the base of which can be approximated by a Taylor series, etc.
This approach seems so brutish and inelegant, so my question is: is there a cleverer, more elegant way to evaluate the limit which holds the key to differentiating exponential functions?
(Let me know if my wording was too poor/confusing; English is not my first language.)
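For what it's worth, the limit can be probed numerically (a quick check of my own): the difference quotient (a^h - 1)/h settles toward ln(a), roughly 0.6931 for a = 2 and about three times that, 2.0794, for a = 8.

```python
import math

def diff_quotient(a, h):
    # (a^h - 1)/h, which tends to ln(a) as h -> 0
    return (a ** h - 1) / h

for a in (2, 8):
    for h in (1e-2, 1e-4, 1e-6):
        print(a, h, diff_quotient(a, h))
    print("ln(%d) =" % a, math.log(a))
```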
Meeh :/ I tried using a LaTeX-to-bbcode converter but it doesn't work the way I want it to. So, in short and without formality:
You want a proof that the limit of f(x) as x -> 0, with f(x) = (b^x - 1)/x, exists. Did I get that right? You prove this by showing that f is strictly monotonically increasing on R+, which you can do using the mean value theorem. Hence the limit of f(x) as x -> 0 equals inf{ f(x) | x in R+ } =: s. (1) Now you take a convergent sequence h_n in R\{0} with h_n -> 0 and show that lim f(|h_n|) = s (via the monotonic increase and the infimum property). You can easily show that f(-h) = b^(-h) f(h) for h in R+. Also, x -> b^x is continuous at 0, so lim b^(-|h_n|) = b^0 = 1. Hence lim b^(-|h_n|) f(|h_n|) = 1 * s = s. (2)
Let d in R+. From (1) it follows that there is an N_1 in the naturals with |f(|h_n|) - s| < d for all n > N_1. From (2) it follows that there is an N_2 in the naturals with |b^(-|h_n|) f(|h_n|) - s| < d for all n > N_2. For every n, f(h_n) is either f(|h_n|) or b^(-|h_n|) f(|h_n|), so |f(h_n) - s| < d for all n > max{N_1, N_2}.
Hope this is helpful. I never did any math in English, so I'm lacking all the vocabulary; however, I hope you can follow my "proof".
Edit: There is an elegant way to determine the derivative of a^x (and hence the limit you are looking for), which is exactly what the dude wrote before me: a^x = exp(x ln(a)), and it tells you the limit is ln(a). I don't think you need to look any further for a clever and elegant way, because you already use it all the time: the common differentiation rules and the fact that exp is differentiable on its entire domain. (Btw, you don't need L'Hopital to prove any of this.)
Can't really answer how to approach the limit without knowing the exponential beforehand, but my 2 cents:
If you assume you don't know the exponential, the only way (I know of) to define a^x is as the unique solution of the functional equation f(x+y) = f(x)*f(y) with f(1) = a, for a > 0.
So f' must satisfy f'(x+y) = f'(x)*f(y), and setting x = 0 you get f'(x) = f'(0)*f(x).
And now I'm stuck, because one of the definitions of e^x is as the solution of the differential equation f'(x) = f(x).
But that gives an answer to why the derivative is some constant times the function itself.
I think one of the working definitions of e is that the derivative of e^x is e^x. Whichever of the different ways you choose to define e, you can prove from there that that statement is true. I don't remember the specifics, but it works.
If you have that settled, and also have the chain rule of calculus proven, you can take the route that Liebig explained above.
And that is an absolutely fine way to prove the thing you wanted to prove, or to approach the series. You know other things, and derive the result from them. So you don't know what the derivative of a^x is, but you know what the derivative of e^x is, either by definition or by something that follows from that definition.
You are trying to go in the opposite direction. You don't start by proving what the derivative of a^x is and go to e^x from there; you go the other way, starting with e^x and going to a^x.
exp is not defined by exp'(x) = exp(x). It's defined by a power series (exp(x) = sum x^n/n!, n = 0 to infinity) or as the limit of the sequence (1 + x/n)^n as n -> infinity.
But besides that I agree with you, Simberto. It feels like HKTPZ is going about it the wrong way. That's also what I was trying to say, in some way: you already know d(a^x)/dx = ln(a) a^x because of the chain rule and the exp properties you use daily: d(a^x)/dx = d(exp(x ln(a)))/dx = exp(x ln(a)) ln(a) = a^x ln(a). And d(a^x)/dx = lim (a^(x+h) - a^x)/h = a^x lim (a^h - 1)/h with h -> 0. In short, a^x ln(a) = d(a^x)/dx = a^x lim (a^h - 1)/h.
Commonly you calculate d(a^x)/dx = a^x ln(a) via the differentiation rules. These are not rules of thumb but proven generalizations that hold whenever their hypotheses are met. So the clever and elegant way you are looking for to determine the limit lim (a^h - 1)/h is precisely the set of differentiation rules and exp properties.
On September 29 2017 06:16 jodljodl wrote: exp is not defined by exp'(x) = exp(x). It's defined by a power series (exp(x) = sum x^n/n!, n = 0 to infinity) or as the limit of the sequence (1 + x/n)^n as n -> infinity.
The exponential can also be defined as the only solution of y' = y with y(0) = 1, implying exp'(x) = exp(x).
You need a lot more mathematical background to prove that this differential equation has a unique solution than you need to define the exponential function by its series representation. Also, the series representation can be generalized more easily (say, to bounded operators acting on possibly infinite-dimensional Banach spaces), and it allows you to verify properties of the (real) exponential function, like its limit behaviour when composed with polynomial functions, in a very simple way. I don't really see the advantage of your proposed definition, unless it's part of a class for, say, engineers who are primarily interested in solving differential equations.
Mind you, I don't want to say that either definition is really more "correct" (at least in the case of real exponential functions); I just think that the series representation fits more naturally into the way an analysis class is structured. At least where I'm from; maybe it's different if you do several classes of "calculus" before developing an axiomatic theory.
Edit: The more I think about it, the less I like your way of defining the exponential function. It really only works properly for functions defined on at least some open neighbourhood of 0 on the real axis. And at the very least, as soon as you try to generalize to the complex plane you will inevitably need the series representation to prove uniqueness of the solution anyway, because you will need some form of an identity theorem for holomorphic functions. It just feels like it is only useful in a setting where the people being taught are interested solely in the theory of ordinary differential equations on real functions and their strong solutions.
I remember at least three different definitions of e or exp: one via the infinite series, one via the differential equation, and one as the limit of (1+1/n)^n. Though I admit I had to look that last one up. It has been a while since my introductory maths classes, but I am rather certain that you can prove they are all equivalent anyway. So basically, just use whichever is most useful in the situation you are in. The differential equation is probably not the most useful for introductory maths classes for maths students, due to the problems that you mention. And except for maths students, no one cares about the exact way something can be derived from other stuff and whether that is actually correct.
Edit: Also, from the infinite series of the exp function to d/dx exp(x) = exp(x) you just need to differentiate polynomials, and derivatives of polynomials are easily proven from the definition of the derivative. So with regards to the original question, it doesn't really matter which one we start with. The way to d/dx a^x (and the limit in question) from not a lot of other stuff is: prove d/dx x^n = n x^(n-1), prove the chain rule, define exp as the infinite series sum x^n/n!, do some work to get a reasonable definition of ln(x), derive d/dx exp(x) = exp(x), define a^x as exp(x ln(a)), and then differentiate via the rule set that has been proven.
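Just as a numerical aside (my own snippet, not a proof of anything): the series and limit definitions mentioned above can be compared directly against the library exponential, and they agree to within the expected truncation error.

```python
import math

def exp_series(x, terms=30):
    # Partial sum of the power series sum x^n / n!
    return sum(x ** n / math.factorial(n) for n in range(terms))

def exp_limit(x, n=10**6):
    # The sequence (1 + x/n)^n for large n
    return (1 + x / n) ** n

for x in (0.5, 1.0, 2.0):
    print(x, exp_series(x), exp_limit(x), math.exp(x))
```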
Your last one should read (1+x/n)^n if you want to define the actual exponential map, I think.
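For what it's worth, the three definitions can at least be checked against each other numerically. A quick sketch of my own (not from the thread; the sample point x = 1.7 and the cutoffs are arbitrary choices):

```python
import math

# Compare three routes to exp(x) at one sample point.
x = 1.7

# 1) Partial sum of the power series  sum x^n/n!
series = sum(x**n / math.factorial(n) for n in range(50))

# 2) (1 + x/n)^n for a large n (converges, but only at rate ~1/n)
n = 10**7
limit_form = (1 + x / n) ** n

# 3) The library value as a reference
ref = math.exp(x)

assert abs(series - ref) < 1e-12
assert abs(limit_form - ref) < 1e-3
```

Note how much slower the (1 + x/n)^n form converges than the series; that alone is a practical reason to prefer the series in computations.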
Of course they are "equivalent" as long as you stay in the realm of real functions. But that was precisely the point I was trying to make. Understanding why the differential equation (a) has a solution and (b) has a unique solution requires way more machinery than covering convergent series. The simplest way of proving that the differential equation in question has a solution is to give one. But if you are going to define the exponential function that way, then you can hardly use the exponential function to show that a solution exists. Using the series expression would also defeat the purpose of using the differential equation to define the exponential function in the first place. So all that is left is relying on more sophisticated tools, like some kind of existence/uniqueness result for differential equations, I guess. That does seem rather 'high level' for such simple endeavours.
I mean, assuming we want to define the exp function as the unique solution to the differential equation: how would I go about proving that the differential equation has a solution, and that it is unique? Even attempting some fixed-point iteration seems a bit ridiculous, because that relies on an understanding of convergent sequences (which correspond 1:1 to convergent series, the exact thing we were trying to avoid).
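To illustrate that point about fixed-point iteration: Picard iteration for y' = y, y(0) = 1 reproduces exactly the partial sums of the exponential series, so the sequences/series machinery sneaks right back in. A small sketch of my own, with polynomials stored as exact rational coefficient lists:

```python
from fractions import Fraction
from math import factorial

# Picard iteration for y' = y, y(0) = 1:
#   y_{k+1}(x) = 1 + integral from 0 to x of y_k(t) dt.
# Polynomials are coefficient lists: coeffs[k] is the coefficient of x^k.
def picard_step(coeffs):
    # Integrating sum c_k x^k gives sum c_k x^(k+1)/(k+1); then add the 1.
    return [Fraction(1)] + [c / (k + 1) for k, c in enumerate(coeffs)]

y = [Fraction(1)]  # y_0 = 1
for _ in range(4):
    y = picard_step(y)

# The fourth iterate is exactly the exponential series cut off at x^4/4!:
assert y == [Fraction(1, factorial(k)) for k in range(5)]
```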
Concerning your edit: You can't just assume that the derivative of a series (i.e. of the limit of finite sums) is the same as the limit of the derivatives of the finite sums. There are certain settings where limits and derivatives can be exchanged, but this is NOT true in general, and you should always make very sure why exactly you are allowed to interchange limits (after all, taking a derivative is just another limit). In this case the uniform convergence of the differentiated series on compact sets (together with the pointwise convergence of the original series) saves the day, but we should always make sure we know the reasons why certain operations yield correct results.
Of course after making sure that all this is allowed, you're perfectly right. Then all that remains to do is exchange the series' limit and the d/dx and differentiate dem juicy polynomials. :>
On September 29 2017 06:16 jodljodl wrote: exp is not defined by exp'(x) = exp(x). It's defined by a power series (exp(x) = sum x^n/n!, n = 0 to infinity) or as the limit of the sequence (1 + x/n)^n, n -> infty.
The exponential can also be defined as the unique solution of y' = y with y(0) = 1, which implies exp'(x) = exp(x).
If you define e^x as an infinite series, then the derivative is obviously the derivative of that infinite series. You don't need the general case, you need this very specific case. You should of course prove f(x) = f'(x) for f(x) = e^x, using whatever definition of the derivative has your preference.
Not sure what you're trying to say, or even what part of my post you are referring to exactly. If it's about the last part: "the derivative of the series" would be d/dx \lim_n \sum_{k=0}^n x^k/k!, and all I said is that you need a reason why this equals \lim_n d/dx \sum_{k=0}^n x^k/k!, i.e., a reason why you are allowed to exchange the order of derivative and limit. This is NOT trivial, nor true for arbitrary series. It is, however, exactly what you need to do to reduce differentiating the exponential function to differentiating polynomials, i.e., applying the differentiation to the finite sums.
I know that in some courses for engineers (or even physicists) people pretend that limits can be exchanged arbitrarily. Time to accept that this is just wrong, and only ever done because it would be too time-consuming to give the full picture.
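A standard counterexample (my choice of illustration, not mentioned in the thread) makes the danger concrete: f_n(x) = sin(nx)/n converges uniformly to the zero function, yet the derivatives f_n'(x) = cos(nx) do not converge to its derivative.

```python
import math

# Counterexample: f_n(x) = sin(n*x)/n tends to 0 uniformly in x
# (|f_n| <= 1/n everywhere), so the limit function is 0 and its
# derivative is 0. But f_n'(x) = cos(n*x), and f_n'(0) = 1 for all n,
# so lim f_n'(0) = 1 != 0 = (lim f_n)'(0).
def f(n, x):
    return math.sin(n * x) / n

def f_prime(n, x):
    return math.cos(n * x)

n = 10**6
assert abs(f(n, 0.3)) <= 1 / n       # the functions are uniformly tiny
assert f_prime(n, 0.0) == 1.0        # their derivatives at 0 are not
```

Here the derivatives fail to converge uniformly (they oscillate ever faster), which is exactly the hypothesis that the interchange theorem needs and that power series do satisfy on compact subsets of their disc of convergence.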
Clearly I didn't manage to ask my question the right way, so I will try to rephrase ^^
I know a^x can be rewritten as e^(x*ln a), I know the derivative of e^x is e^x times ln e, which simplifies to e^x, and I know Leibniz's chain rule.
I'm asking: WITHOUT using that knowledge, is there a clever and elegant way to show that lim ((a^h)-1)/h as h goes to 0 is exactly 1/n of lim (((a^n)^h)-1)/h as h goes to 0, implying there is a connection to log?
I can show that lim ((a^h)-1)/h as h goes to 0 exists, but I'm at a loss trying to show the limit is ln a. (What the base is, is trivial; the point is to show the connection to logarithmic behaviour.)
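Not an answer to the "elegant proof" question, but the claimed limit and the 1/n scaling are easy to check numerically. A sanity check only (my own sketch; the step size h = 1e-8 is an arbitrary choice):

```python
import math

# Estimate L(a) = lim_{h -> 0} (a^h - 1)/h by taking a small h, and
# compare with ln(a). Also check the scaling the question asks about:
# L(a^n) = n * L(a), here with a = 2 and n = 3 (so a^n = 8).
def L(a, h=1e-8):
    return (a**h - 1) / h

for a in (2.0, 8.0, 10.0):
    assert abs(L(a) - math.log(a)) < 1e-6   # numerically L(a) ~ ln(a)

assert abs(L(8.0) - 3 * L(2.0)) < 1e-6      # L(2^3) ~ 3 * L(2)
```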
Thank you for talking down to engineers and physicists. I am actually a mathematician (well, at least I got my propaedeuse in math; I never completed my BSc, as I moved to CS, which is less hooked up on infinitesimal details).
Anyway, I see your problem, and I think you are simply misunderstanding what I am saying. I do see that there is a more fundamental problem with Simberto's approach, and it's all in the definition of e^x as an infinite series, but it's not in the proof of the derivative. So let's clarify that first:
We define:
f(x) = sum x^n/n! (with n 0 to infinity)
Now:
df /dx = d / dx sum x^n/n! (with n 0 to infinity) (by definition)
Now we have to prove that df / dx = f.
Lemma: d / dx sum x^n/n! (with n 0 to N) = sum x^n/n! (with n 0 to N - 1).
Proof:
Base step: d / dx sum x^n/n! (with n 0 to 1) = d / dx (1 + x) = 1 = sum x^n/n! (with n 0 to 0). (little square box)
Inductive step: Assume d / dx sum x^n/n! (with n 0 to M) = sum x^n/n! (with n 0 to M - 1).
d / dx sum x^n/n! (with n 0 to M + 1) = d / dx (x^(M + 1)/(M + 1)! + sum x^n/n! (with n 0 to M)) = (M + 1)x^M / (M + 1)! + d / dx sum x^n/n! (with n 0 to M) = x^M / M! + sum x^n/n! (with n 0 to M - 1) = sum x^n/n! (with n 0 to M) (little square box)
QED.
Theorem:
d / dx sum x^n/n! (with n 0 to infinity) = sum x^n/n! (with n 0 to infinity).
Proof (using the Lemma above, plus the fact that a power series may be differentiated term by term inside its radius of convergence, which is what justifies exchanging d/dx and the limit): d / dx sum x^n/n! (with n = 0 to infinity) = lim(N -> inf) d / dx sum x^n/n! (with n = 0 to N) = lim(N -> inf) sum x^n/n! (with n = 0 to N - 1) = lim(N -> inf) sum x^n/n! (with n = 0 to N) = sum x^n/n! (with n = 0 to infinity).
QED
So there we have it: f(x) = f'(x) for f defined as above.
Now, of course, there is a problem, and that is that we haven't defined a^x, only e^x. I believe the usual solution is to simply define a^x = e^(x ln a). And from then onwards it's easy. But I'm not sure that this satisfies what the original question was asking.
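The Lemma itself is mechanical enough to check with exact arithmetic. A throwaway sketch of mine, representing polynomials as coefficient lists:

```python
from fractions import Fraction
from math import factorial

# Check the Lemma: d/dx of the partial sum up to x^N/N! equals the
# partial sum up to x^(N-1)/(N-1)!. Polynomials are coefficient lists
# (coeffs[k] is the coefficient of x^k), kept exact with Fraction.
def poly_derivative(coeffs):
    # d/dx of c_k x^k is k * c_k x^(k-1): multiply by k, shift down
    return [k * c for k, c in enumerate(coeffs)][1:]

for N in range(1, 12):
    partial_N = [Fraction(1, factorial(k)) for k in range(N + 1)]
    partial_N_minus_1 = [Fraction(1, factorial(k)) for k in range(N)]
    assert poly_derivative(partial_N) == partial_N_minus_1
```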
On September 29 2017 01:39 HKTPZ wrote: Hey y'all.
I've been messing around with differentiation of exponential functions the last few days, and there's something that I can't seem to figure out, so I thought I'd ask TeamLiquid for advice/help/inspiration.
By the definition of differentiation, the derivative of a^x is a^x times the limit of ((a^h)-1)/h as h goes to 0. Evaluating that limit is trivial by applying L'Hopital, BUT that rests upon already knowing the derivative of a^x, so assuming that isn't known, how does one figure out that limit?
The brute-force approach obviously is just convincing oneself that the derivative of 2^x is 2^x itself times a constant, which appears to be approximately 0.7.
Then one could do the same for 8^x and come to the realization that the derivative of 8^x is 8^x times a constant which is about 3 times as great as 0.7, which suggests we're looking at something logarithmic, the base of which can be approximated by a Taylor series etc etc.
This approach seems so brutish and inelegant, so my question is: is there a cleverer and more elegant way to evaluate the limit which holds the key to differentiating exponential functions?
(Let me know if my wording was too poor/confusing; English is not my first language.)
Hello, HKTPZ.
Honestly, I would do it the way Liebig suggested.
On September 29 2017 02:37 Liebig wrote: a^x = exp(ln(a^x))=exp(x*ln(a))
so the derivative of a^x is ln(a)*a^x
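Liebig's formula is easy to sanity-check with a central difference (my own quick sketch; the base a = 3 and the tolerance are arbitrary choices):

```python
import math

# Central-difference check of the claim d/dx a^x = ln(a) * a^x.
def numeric_derivative(g, x, h=1e-6):
    return (g(x + h) - g(x - h)) / (2 * h)

a = 3.0
for x in (0.0, 1.0, 2.5):
    expected = math.log(a) * a**x
    observed = numeric_derivative(lambda t: a**t, x)
    assert abs(observed - expected) < 1e-4
```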
If we can't use that approach, though, then we have to go back to the definition of the derivative.
The function we have is f(x) = a^x. Plug this into the definition:
f'(x) = lim (h -> 0) (a^(x+h) - a^x)/h
Use one of the properties of exponents:
f'(x) = lim (h -> 0) (a^x * a^h - a^x)/h
Factor a^x and bring it in front of the limit. This can be done because it doesn't depend on h:
f'(x) = a^x * lim (h -> 0) (a^h - 1)/h
We have here the indeterminate form 0/0, so we can apply L'Hopital's rule. The denominator is just 1, so it disappears. In the numerator, the constant disappears. We're just left with what we set out to find from the beginning:
f'(x) = a^x * lim (h -> 0) d/dh a^h
This is just f'(0):
f'(x) = a^x * f'(0)
We stated at the beginning that f(x) = a^x. Substitute it here:
f'(x) = f(x) * f'(0)
We thus have a differential equation for f(x). Divide both sides by f(x):
f'(x)/f(x) = f'(0)
The left side is just the derivative of ln f:
d/dx ln f(x) = f'(0)
Now integrate both sides with respect to x:
ln f(x) = f'(0) * x + C
Exponentiate both sides so that we just have f(x) on the left side:
f(x) = e^(f'(0) * x + C)
Use one of the properties of exponents:
f(x) = e^(f'(0) * x) * e^C
Use a new constant for e^C:
f(x) = A * e^(f'(0) * x)
Now substitute a^x for f(x):
a^x = A * e^(f'(0) * x)
We can determine A by setting x = 0. Doing so gives us 1 = A.
All that's left to do is to solve for f'(0). Take the natural log of both sides:
ln a^x = ln e^(f'(0) * x)
Use the property of logs that allows us to bring the power of the argument to the coefficient:
x * ln a = f'(0) * x * ln e
ln e = 1 and x cancels from both sides. We're left with the value of f'(0):
f'(0) = ln a
f'(x) = a^x * f'(0). Therefore, f'(x) = ln(a) * a^x.
I hope this helped you out.
Sincerely, Shalashaska_123
On September 29 2017 17:23 HKTPZ wrote: Clearly, I didnt manage to ask my question the right way so I will try to rephrase ^^
As in the post above: please define a^x in a suitably formal manner, and we can take it from there.
Holy wowzerz. You generated a million different images from latex, uploaded them to imgur and included them here. You're my hero.
Also, a super elegant solution, but there is one slight hitch:
"The left side is just the derivative of ln f."
It's easy to prove, but you'd still have to return to first principles. I think you don't escape needing to define e^x in order to be able to define ln x as its inverse. And then you'd need to prove that sentence there. Which means a lot of work.
On September 29 2017 01:39 HKTPZ wrote: Hey y'all.
I've been messing around with differentiation of exponential functions last few days and there's something that I cant seem to figure out so I thought I'd ask TeamLiquid for advise/help/inspiration.
By the definition of differentiation, the derivative of a^x is a^x times the limit of ((a^h))-1/h) as h goes to 0. Evaluating that limit is trivial by applying L'H BUT that rests upon already knowing the derivative of a^x so assuming that isnt known how does one figure out that limit?
the brute approach obviously is just convincing oneself that the derivate of 2^x is 2^x itself times a constant which appears to be approximately 0.7
then one could do the same for 8^x and come to the realization that the derivative of 8^x is 8^x times a constant which is about 3 times as great as 0.7 - which suggests we're looking at something logarithmic - the base of which can be approximated by a Taylor Series etc etc
this approach seems so bruteish and non-elegant so my question is: is there a cleverer and more elegant way to evaluate the limit which holds the key to differentiation of exponential functions?
(help me know if my wording was too poor/confusing - English is not my first language)
Hello, HKTPZ.
Honestly, I would do it the way Liebig suggested.
On September 29 2017 02:37 Liebig wrote: a^x = exp(ln(a^x))=exp(x*ln(a))
so the derivative of a^x is ln(a)*a^x
If we can't use that approach, though, then we have to go back to the definition of the derivative.
The function we have is f(x) = a^x. Plug this into the definition.
Use one of the properties of exponents.
Factor a^x and bring it in front of the limit. This can be done because it doesn't depend on h.
We have here the indeterminate form 0/0, so we can apply L'hopital's rule.
The denominator is just 1, so it disappears. In the numerator, the constant disappears. We're just left with what we set out to find from the beginning.
This is just f'(0).
We stated at the beginning that f(x) = a^x. Substitute it here.
We thus have a differential equation for f(x). Divide both sides by f(x).
The left side is just the derivative of ln f.
Now integrate both sides with respect to x.
Exponentiate both sides so that we just have f(x) on the left side.
Use one of the properties of exponents.
Use a new constant for e^C.
Now substitute a^x for f(x).
We can determine A by setting x = 0. Doing so gives us 1 = A.
All that's left to do is to solve for f'(0). Take the natural log of both sides.
Use the property of logs that allows us to bring the power of the argument to the coefficient.
ln e = 1 and x cancels from both sides. We're left with the value of f'(0).
f'(x) = a^x f'(0). Therefore,
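The equations the steps above refer to were posted as images and are not visible here; reconstructed as a single block, the argument runs:

```latex
f'(x) = \lim_{h\to 0}\frac{a^{x+h}-a^{x}}{h}
      = \lim_{h\to 0}\frac{a^{x}\,a^{h}-a^{x}}{h}
      = a^{x}\lim_{h\to 0}\frac{a^{h}-1}{h}
      = a^{x}\,f'(0)

\frac{f'(x)}{f(x)} = f'(0)
\;\Rightarrow\; \frac{d}{dx}\ln f(x) = f'(0)
\;\Rightarrow\; \ln f(x) = f'(0)\,x + C

f(x) = e^{f'(0)\,x + C} = A\,e^{f'(0)\,x},
\qquad f(0) = a^{0} = 1 \;\Rightarrow\; A = 1

a^{x} = e^{f'(0)\,x}
\;\Rightarrow\; x\ln a = f'(0)\,x
\;\Rightarrow\; f'(0) = \ln a,
\qquad\text{so}\quad f'(x) = a^{x}\ln a
```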
I hope this helped you out.
Sincerely, Shalashaska_123
Holy wowzerz. You generated a million different images from latex, uploaded them to imgur and included them here. You're my hero.
Also, a super elegant solution, but there is one slight hitch:
"The left side is just the derivative of ln f."
It's easy to prove, but you'd still have to return to first principles. I think you don't escape needing to define e^x in order to be able to define ln x as its inverse. And then you'd need to prove that sentence there. Which means a lot of work.
Thanks, mate.
The only justification needed for that step is this. We know that the derivative of ln x is 1/x. If we have ln f(x) instead, then we have to apply the chain rule: (1/f)*f', which is f'/f. Having to go any further than that is just splitting hairs in my humble opinion and would take focus away from the original problem.
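That chain-rule identity is easy to sanity-check numerically; a minimal sketch using central finite differences (the function and step size are arbitrary choices of mine):

```python
import math

def num_deriv(g, x, h=1e-6):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

a = 2.0
f = lambda t: a ** t

# d/dx ln f(x) should equal f'(x)/f(x) at every point.
for x0 in (0.5, 1.0, 2.0):
    lhs = num_deriv(lambda t: math.log(f(t)), x0)
    rhs = num_deriv(f, x0) / f(x0)
    assert abs(lhs - rhs) < 1e-6
```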
On September 29 2017 18:34 Shalashaska_123 wrote: + Show Spoiler +
This is exactly what I was looking for. Elegant and simple. Thank you very much for spelling it out to such length.
I feel so ambivalent right now xD happy that it makes sense to me now and sad that I was too stupid to figure it out on my own
Clearly it's academic, because HKTPZ is happy with it, but the way I interpreted the question was to remove all of the plumbing, which also means you can't just take as known that the derivative of ln x is 1/x; you would have to prove that, which needs some basic knowledge of what the natural logarithm is... and HKTPZ kinda forbade us from using that knowledge.
It's a bit strange to ask to prove that the derivative of f(x) = a^x is ln(a) * a^x WITHOUT using the knowledge that a^x = e^(x ln a), but WITH using the general knowledge of what ln is. One follows quite naturally from the other. But I guess it's homework. Placing arbitrary limitations is a great way of figuring out what students actually know.
Well, fortunately for my odd request, the derivative of ln x does not rely on knowing what the derivative of a^x is, so all's good ^^
I don't think it's a completely arbitrary limitation - in fact I think it's very reasonable. But to each their own :D Also, it isn't even for homework - it was just something I was wondering about.
The derivative of ln x kind of depends on having a good idea what ln x is, though. Which requires a good idea of what e^x is, which means you probably know the derivative of e^x, at which point you can use a^x = e^(x ln a).
On September 29 2017 06:16 jodljodl wrote: exp is not defined by exp'(x) = exp(x). It's defined by a power series (exp(x) = sum x^n/n!, n = 0 to infinity) or as the limit of the sequence (1 + x/n)^n as n -> infinity.
The exponential can also be defined as the unique solution of y' = y with y(0) = 1, which implies exp'(x) = exp(x).
You need a lot more mathematical background to prove that this differential equation has a unique solution than you need to define the exponential function by its series representation. Also, the series representation can be generalized more easily (say, to the case of bounded operators acting on possibly infinite-dimensional Banach spaces), and it allows you to verify properties of the (real) exponential function, like its limiting behaviour relative to polynomial functions, in a very simple way. I don't really see the advantage of using your proposed definition, as long as it's not part of a class for, say, engineers who are primarily interested in solving differential equations.
Mind you, I don't want to say that either definition is really more "correct" (at least in the case of real exponential functions); I just think that using the series representation fits more naturally into the way an analysis class is structured. At least where I'm from - maybe it's different if you do several classes of "calculus" before developing an axiomatic theory.
Edit: The more I think about it, the less I like your way of defining the exponential function. It really only works properly for functions defined on at least some open neighbourhood of 0 on the real axis. And at the very least, as soon as you try to generalize to the complex plane you will inevitably need the series representation to prove uniqueness of the solution anyway, because you will need some form of an identity theorem for holomorphic functions... It just feels like it is really only useful in a setting where the people being taught are interested solely in the theory of ordinary differential equations on real functions and their strong solutions.
The background knowledge you need is different indeed. But I learnt both the same year (how to properly solve a differential equation, or at least know we can, and the power series and what goes with them), which is why I tend to put both those definitions in the same bag. Although, I agree with you that the series expansion definition is more convenient anyway, especially when it comes to complex numbers, but also matrices and other objects.
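For anyone following along, the two definitions mentioned above can be checked against each other numerically; a rough sketch (function names are mine, and the truncation/step choices are arbitrary):

```python
import math

def exp_series(x, terms=30):
    """Partial sum of the power series sum_{n>=0} x^n / n!."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # next term: x^(n+1)/(n+1)!
    return total

def exp_limit(x, n=10**7):
    """The sequence (1 + x/n)^n for one large n."""
    return (1 + x / n) ** n

x = 1.5
# Both definitions agree with math.exp; the series converges much faster.
print(exp_series(x), exp_limit(x), math.exp(x))
```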
random statistics question I want to verify. I missed a point for this on my exam. We had to draw a box plot based on some data. They gave us 15 values.
15 20 22 22 24 24 24 28 30 30 34 36 37
so I calculated the quartiles in this way. q2 = median = 24
then remove the median from the data. then find the median of the lower half of the data, and the median of the upper half of the data.
lower half:
15 20 22 22 24 24
median = 22
upper half:
28 30 30 34 36 37
median = 32 (the mean of 34 and 30)
so for q3 I used 32. I got docked a point, they crossed it out and wrote 30.
Did I do something incorrect, or were they wrong to dock me a point? Thanks.
I see 13, not 15 values. Assuming that there are 15 values, using the simplest method, then then surely the q3 would be 4th from the highest value, which is 30. If there are 13 values, then it should be 3rd from the highest, which is 34. Using another simple method, with 15 values, q3 will be the mean of the 4th and 5th from the highest, and assuming there are 13 values, q3 will be 4th from the highest, which is 30. The other methods are probably too complicated for you and you can't seem to count either 13/15 properly anyways.
Statistics is strange. It isn't a universal maths, so you have to follow whatever methodology is taught. So tell us first the method you are supposed to use.
sorry i said 15 when I meant 13, that doesn't mean I can't count properly lol. what does "then then" mean? clearly you don't understand english, since you made a mistake in what you typed.
anyways, they never told us a method we are supposed to use.
What kind of exam tests something that you have not been taught?
Of course there is supposed to be some transfer, but still, exams are for testing whether you learned the stuff they taught you. If they didn't teach you how to do a box plot, why would they test that? It is not like it is something that you can derive from other knowledge either. You either know how to do it or not.
Yeah, at this point I am just wondering what kind of exam is this? They don't teach you methodology to a statistical analysis, and from the sounds of it you can't ask any fellow colleagues or the examiner and there is no teacher.
Also this is a maths thread, not an English thread. Which is why I don't mention your lack of capitalisation.
On October 05 2017 02:34 travis wrote: random statistics question I want to verify. I missed a point for this on my exam. We had to draw a box plot based on some data. They gave us 15 values.
You computed it correctly. The way you solved it is the conventional way that the quartiles are solved, both in AP Statistics in high school as well as in college statistics.
Source: I'm a high school math/ statistics teacher and a college math/ statistics professor. lol.
Furthermore, that 10th data point (the second 30) is only greater than 9/13 of the data, which is 69% of the data and not 75% so... no, it can't be Q3. Your averaged value of 32, on the other hand, is greater than 10/13 of the data, which is 77% (and that's the best average you can get to ensure that you're above 75% of the data, based on there being 13 data points). Your value of 32 does a significantly better job at splitting the data into a lower 75% and a higher 25% than the 30 does.
Thanks plasma! I guess I'll go try to get that point back then.
As for them teaching us certain methodology, they taught us how to make box plot and what quartiles are. The professor then mentioned that there are multiple ways to calculate quartiles. He briefly went over 3 methods, I don't remember which. It seemed like he wasn't too concerned about which method we used.
Your "mistake" was in removing the median from the data. If you include 24 in the upper and lower halves and take the median of both, you'll get 22 and 30. This is basically method 2 from that wikipedia article you posted. The way you did the problem (method 1) is legitimate, but I suppose method 2 is how they want you to compute quartiles in this class.
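Both conventions are short to code up; here is a sketch (the function names are mine) that reproduces both disputed Q3 values on the 13 data points:

```python
def median(xs):
    xs = sorted(xs)
    n, mid = len(xs), len(xs) // 2
    return xs[mid] if n % 2 else (xs[mid - 1] + xs[mid]) / 2

def quartiles(xs, include_median=False):
    """Return (Q1, Q3). Method 1 excludes the median from the halves;
    Method 2 (include_median=True) puts it in both halves when n is odd."""
    xs = sorted(xs)
    n, h = len(xs), len(xs) // 2
    if n % 2 == 0:
        lower, upper = xs[:h], xs[h:]
    elif include_median:
        lower, upper = xs[:h + 1], xs[h:]
    else:
        lower, upper = xs[:h], xs[h + 1:]
    return median(lower), median(upper)

data = [15, 20, 22, 22, 24, 24, 24, 28, 30, 30, 34, 36, 37]
print(quartiles(data))                       # Method 1: (22.0, 32.0)
print(quartiles(data, include_median=True))  # Method 2: (22, 30)
```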
This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive? It seems obvious to me that the opposite of an opposite is the original, yet my uncle insists that the entire world operates on negatives multiplying out to positives because some guy a while ago made it that way. Any help in convincing a crack pot would be appreciated.
On October 05 2017 12:04 bo1b wrote: This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive? [...]
I'm bad at proofs but I'll give it a shot.
Math is consistent. Let's say we have positive integers x and y, and suppose -x * -y were negative. I assume your uncle accepts that a positive number times a negative number (say, x * -y) is negative. Then (x + (-x)) * -y = x * -y + (-x) * (-y), which would be the sum of two negative numbers and therefore negative. However, x + (-x) = 0, and 0 times any number is 0 (which is neither negative nor positive). Therefore we have a contradiction, meaning -x * -y cannot be negative.
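Written out symbolically, the argument above is:

```latex
0 = 0\cdot(-y)
  = \bigl(x + (-x)\bigr)\cdot(-y)
  = x\cdot(-y) + (-x)\cdot(-y)
  = -(xy) + (-x)\cdot(-y)
\;\Rightarrow\; (-x)\cdot(-y) = xy > 0
```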
On October 05 2017 12:04 bo1b wrote: This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive? [...]
I'm not a mathematician, but this 'debt' example made sense to me:
On October 05 2017 12:04 bo1b wrote: This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive? [...]
Your uncle's point of view has some philosophical appeal... After all, are there really negative numbers? Can they be observed in the real world? If they aren't something we can observe in the real world, how can we be sure what properties they really "have"?
The mathematical answer to the problem of existence is constructing them. This is done the following way (warning: This might not be helpful at all, I included it for sake of completeness):
<math> The cleanest way to do so is by taking ordered pairs of natural numbers (n,m) and introducing the equivalence relation (n,m) ~ (n',m') iff there exists a natural number k such that either (n+k,m+k)=(n',m') or (n,m)=(n'+k,m'+k). Then the set of integers can be defined as the set of all equivalence classes, equipped with an addition and a multiplication. Identifying (n,m) with the integer solution x of the equation n+x=m is how one usually imagines these integers.
This construction solves the issue of "existence" because, once we have established knowledge about the natural numbers, it allows us to construct the integers using nothing but equivalence classes of natural numbers.
When reading an integer nobody thinks "wow, I guess that's a nice equivalence class of pairs of natural numbers", so after establishing this, one proves that the construct indeed has all the properties one wants from integers and moves on with life. </math>
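As a toy illustration of that construction (working with canonical representatives rather than full equivalence classes; the encoding of (a, b) as "a - b" and the function names are my choices):

```python
def norm(p):
    """Reduce the pair (a, b), read as a - b, to its canonical
    representative by subtracting min(a, b) from both components."""
    a, b = p
    k = min(a, b)
    return (a - k, b - k)

def mul(p, q):
    """(a - b) * (c - d) = (a*c + b*d) - (a*d + b*c), using only
    addition and multiplication of natural numbers."""
    (a, b), (c, d) = p, q
    return norm((a * c + b * d, a * d + b * c))

one, neg_one = (1, 0), (0, 1)
# The sign rule falls out of the construction rather than being decreed:
print(mul(neg_one, neg_one))  # (1, 0), i.e. (-1) * (-1) = 1
```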
A more "natural" way to think about it is to regard the integers as a model (one that does not necessarily correspond to anything tangible in the real world) that helps us better formulate and formalize certain problems, like debt. Then your uncle's point reduces to "okay, but someone had to come up with this model!", which is true, and the question of why the product of two negative numbers is a positive number is answered by "because we want the model to behave that way".
Why do we want that? Well, we want the laws of distributivity, associativity and commutativity to hold for the integers, as well as -1 * x = -x... So for two positive integers x, y, their additive inverses -x, -y have to satisfy (-x) * (-y) = (-1) * x * (-1) * y = (-1) * (-1) * x * y = x * y.
So while there is no real universal "truth" about why this is the case, we want a set of integers that obeys these laws, so we built them in a way that they do. Arguing that negative times negative does not have to equal positive boils down to arguing that at least one of these laws should be dropped. It seems reasonable to keep these laws for what we are trying to model, though.
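The remaining piece of that chain, (-1)·(-1) = 1, needs only distributivity and -1·1 = -1:

```latex
0 = (-1)\cdot 0
  = (-1)\cdot\bigl(1 + (-1)\bigr)
  = (-1)\cdot 1 + (-1)\cdot(-1)
  = -1 + (-1)\cdot(-1)
\;\Rightarrow\; (-1)\cdot(-1) = 1
```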
Using the consistency of the pattern in those lines as an argument that negative * positive is negative, and negative * negative is positive. This is obviously not a complete proof, but a good argument nonetheless.
Also, I would like to mention that this thread is not only for maths students or professors. Anyone who has any problem with math is very welcome here!
And according to this, your uncle is actually kind of right. (-1)*(-1) = 1 is a convention that has been decided by people, but any other possible convention breaks some part of maths that is important to us, because we like maths to be consistent.
This sounds like saying 1+1=2 because somebody decided to make it that way. Whilst in one sense it is true linguistically, in many senses it isn't, because the concept itself is real. That a product of two negatives is a positive is because reality works that way. Whilst someone or some group had to decide it works that way, it was decided because it can indeed only work that way. It's like the concept of imaginary numbers (the square root of -1). Whilst the number isn't "real" in the sense that you can count it on your fingers, it is mathematically "real" and exists as a concept. The concept and its applications are absolutely useful in describing the real world, and your computer wouldn't run if people didn't understand imaginary numbers. (Which aren't imaginary at all, btw; it's just the name that got attached because we ran out of words to describe mathematical constructs.)
Edit: that's a bad example. Take fractions. You cannot count fractions on your fingers. But you can add and subtract them, divide and multiply them, and otherwise use them as if they were whole numbers that you can physically count on your fingers, using methods and notations that are fairly easy to do. Now, it is true that fractions exist and everything you do with them works that way because someone, or a group of someones, or independent, separate groups decided that fractions work and are described in that way, but ultimately they decided it works that way because that's just how reality operates. You don't cut a cake in half, put it in the fridge, and find out that it is now 3/4 of a cake. Unless your uncle ate your cake and brought a new one and ate a quarter of it.
Or take pi. Pi was worked out by physically measuring the circumference of a circle and comparing it with its diameter (or radius). Different groups in the ancient world all independently "discovered" pi. They all decided what pi was (some rather more accurately than others). Now, it is true that pi is what some guy a while ago made it, but in reality it can only be that value (or around that value, depending on how accurate you want to be) because that is simply how reality is.
Of course you have to question the motive of your uncle. In one sense he is broadening your mind. To question reality. To inquire about the world around you. To introspect on the nature of truth. To set you upon the path of self-knowledge that cannot be taken away from you. In another sense he sounds like an idiot. Ask him why 1+1=2 and you will find out which is which.
Is your uncle Leopold Kronecker? It is a rather fundamental question whether math is discovered or invented. I do not agree with most of Dangermousecatdog's examples; heck, I don't agree with Leopold Kronecker. Math is like the hammer we invented to cut cheese, as in: pi is real, yet in our conventional number system transcendental. There are different ways to set up math without it contradicting itself. Buzzwords are hyperbolic geometry, rational geometry, Inter-universal Teichmüller theory. I don't know anything about these things, yet in my ignorance I am willing to imagine that two negatives multiplied does not give you a positive in all shapes math can take.
Is cheesecake a pie? Can it be a cake? What if I cut the pie in half with a hammer, put it in a fridge and took half away and multiplied it with another negative half pie? Do I receive a positive quarter pie? I now have 3/4 of a cake?
On October 05 2017 21:28 Dangermousecatdog wrote: Is cheesecake a pie? Can it be a cake? What if I cut the pie in half with a hammer, put it in a fridge and took half away and multiplied it with another negative half pie? Do I receive a positive quarter pie? I now have 3/4 of a cake?
If you multiply two negative half pies, you get a positive quarter square pie.
I'm having a little bit of trouble understanding the steps in the example (mainly because a "negative half pie" isn't an actual thing, and I don't understand why we're comparing cakes to pies), so let me try to come up with an example that helps illustrate the "negative * negative = positive" issue. Obviously, it's pretty hard to come up with a concrete example because the act of negating a value (owing, lowering, lessening, reversing, opposing, etc.) is usually a verb rather than a noun. For that reason, it might not make sense to assume that we can take a negative tangible object and multiply it by another negative tangible object, but we can still describe a process where those negative labels might make sense in context.
- The act of removing*** an object, for example, can be represented as negating that object. If we take the reverse*** of that decision, we're adding an object. Therefore, we have a double negative which serves the same purpose as a positive.
- The opposite*** of a temperature decreasing*** means that a temperature is increasing.
- If I owe*** you money and then the debt becomes reversed***, then I'll be receiving money from you.
***All of these terms can be represented as a negative sign in some context, computationally or conceptually.
Therefore, we may choose to represent these situations as a double-negative. While they may be "simplified" into a positive overall, retaining the double negative may actually help describe the two-part negation context that would otherwise be lost by merely seeing a positive.
The same line of reasoning works for whether it's establishing that x - - y = x + y or that -x * -y = xy, because once we understand that a double-negative nets you a positive, the latter can be rewritten as -1*x*-1*y = (-1)(-1)(xy) = (1)(xy), unless someone has a further issue with multiplication or commutativity.
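Incidentally, the factoring argument above is exactly the ring identity -x * -y = xy; in Lean 4 with Mathlib it is available as a lemma (under the name `neg_mul_neg`, if I recall correctly) and can be stated in one line:

```lean
import Mathlib

-- The double-negative cancellation as a ring identity over the integers.
example (x y : ℤ) : -x * -y = x * y := neg_mul_neg x y
```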
Multiplying by a positive number represents stretching or shrinking. Multiplying by a negative number represents stretching/shrinking together with a 180* rotation.
So if you multiply by two negative numbers, the two 180* rotations make a 360* rotation = no rotation. You are just left with the stretching/shrinking, which is a positive number.
I don't know if that would be a formal proof per se, but that's what I wrote above (the factoring out of the two -1's) so I definitely think it's a pretty decent justification.
On October 06 2017 00:29 Day_Walker wrote: A geometric argument:
Multiplying by a positive number represents stretching or shrinking. Multiplying by a negative number represents stretching/shrinking together with a 180* rotation.
So if you multiply by two negative numbers, the two 180* rotations make a 360* rotation = no rotation. You are just left with the stretching/shrinking, which is a positive number.
No, that's silly, that's like saying the reason the earth moves the way it does is because shadows on earth have to look the way they do - you have it the other way and it's a consequence of more fundamental things.
You're better off thinking about multiplying by a constant as moving along a line in an x-y coordinate system, and you can stretch at any point on this line (going to infinity on both sides)
On October 05 2017 20:15 Dangermousecatdog wrote: This sounds like saying 1+1=2 because somebody decided to make it that way. Whilst in one sense it is true linguistically, in many senses it also isn't true in that the concept itself is real.
1+1=2 is merely an abstraction with no meaning in the real world. however, if i have 1 apple and then i acquire a 2nd apple.. i now have 2 apples. there is the real world application of 1+1=2.
also, "pi" isn't really "math". "pi" is a constant.
On October 06 2017 00:29 Day_Walker wrote: A geometric argument:
Multiplying by a positive number represents stretching or shrinking. Multiplying by a negative number represents stretching/shrinking together with a 180* rotation.
So if you multiply by two negative numbers, the two 180* rotations make a 360* rotation = no rotation. You are just left with the stretching/shrinking, which is a positive number.
No, that's silly, that's like saying the reason the earth moves the way it does is because shadows on earth have to look the way they do - you have it the other way and it's a consequence of more fundamental things.
Right, in this case we get to choose how the earth moves, and we can make our choice based on the shadows we want to see. The same thing is actually going on in the algebraic proof that travis gave: we want certain properties (shadows) like distributivity, and for those properties to hold we see that -1 * -1 must be 1 (the earth must move a certain way).
On October 06 2017 09:57 FiWiFaKi wrote: You're better off thinking about multiplying by a constant moving along a line in a x-y coordinate system, and you can stretch at any point on this line (going to infinity on both sides)
I'm not sure I understand you here. Are you looking at the graph y = kx? What do you mean by stretching at a point?
The reason I like thinking about rotations is that it fits with arithmetic in C. For example if we view -1 as a 180* rotation, it makes sense that -1 should have two square roots: 90* rotation clockwise, and 90* rotation counterclockwise.
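The rotation picture is easy to play with numerically in the complex plane; a small sketch (plain Python, nothing assumed beyond the standard library):

```python
# Sketch: multiplication by a negative real as a 180-degree rotation in the
# complex plane, and +/-i as the two "half turns" (square roots of -1).
import cmath
import math

z = 3 + 0j
assert (-1) * z == -3        # one 180-degree turn of a positive real
assert (-1) * (-1) * z == z  # two half turns = full turn = no rotation

# Rotation by angle t is multiplication by e^{it}:
half_turn = cmath.exp(1j * math.pi)         # approximately -1
quarter_turn = cmath.exp(1j * math.pi / 2)  # approximately i

assert abs(half_turn - (-1)) < 1e-9
assert abs(quarter_turn ** 2 - (-1)) < 1e-9     # i squared is -1
assert abs((-quarter_turn) ** 2 - (-1)) < 1e-9  # (-i) squared is -1 as well
```

The last two assertions are the point made above: -1 has two square roots because a half turn can be split into a quarter turn either clockwise or counterclockwise.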
I think his argument is along the lines that pi is only a number. Saying a number is maths is slightly weird, but not entirely. I would say that "3 is maths" doesn't sound quite right. Obviously numbers are a part of maths, and if you get into the definition of natural numbers via peano axioms, it gets quite mathy. But i still don't think "3 is maths" is a very sound statement.
Basically the same goes for pi. Pi is something you use in geometry, and geometry is maths. But pi itself is just a number.
Granted, the whole rant isn't very exact and more feelings-based, but i can understand why one would say that pi isn't maths.
Pi itself is not just a number. All constants are mathematically interesting. All constants are "maths". It's like saying e is just a number. It's a number with meaning and is one of the foundations of trig and geometry. It's a number that has to be calculated algorithmically. It is also a good example of something that at face value is the way it is because some guy made it that way, but if you look into it deeper it is both true and not true at the same time, just like how multiplying two negatives makes a positive.
On October 05 2017 12:04 bo1b wrote: This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive? It seems obvious to me that the opposite of an opposite is the original, yet my uncle insists that the entire world operates on negatives multiplying out to positives because some guy a while ago made it that way. Any help in convincing a crack pot would be appreciated.
A mathematical proof only works if you have defined your axioms. As was mentioned before, a completely rigorous proof would involve the construction of the integers and the respective definition of the order relation (starting from the axioms of some set theory), which would be overkill right now. Instead we take the following rules (and the typical rules of calculating) as axioms themselves, which I doubt your uncle would take issue with. For all integers (or rational numbers, or real numbers) a, b, c it holds:
A0: either a<b, a=b or b<a
A1: (a<b and b<c) implies a<c
A2: a<b implies a+c<b+c
A3: (a<b and 0<c) implies ac<bc
Statement 1. c<0 implies 0<-c. Proof: add -c to both sides of c<0 via A2.
Statement 2. (c<0 and a<b) implies bc<ac. Proof: c<0 implies 0<-c by Statement 1. It follows from A3 that -ac<-bc. Now add ac on both sides (via A2) to arrive at 0<ac-bc. Finally add bc on both sides (again via A2) and arrive at bc<ac.
Corollary: (a<0 and c<0) implies 0=0*c<ac, i.e. the product of two negatives is positive.
Another nice corollary is 0<1: Assume this is wrong, i.e. 1<0 (because 1=0 is excluded). Then a<b implies 1*b<1*a by Statement 2 which is a contradiction of A0.
[EDIT]: I just realized that it's nicer for your purpose if you substitute A3 with A3': (0<a and 0<c) implies 0<ac
It shouldn't be hard to convince anyone that the product of two positives remains positive. From A3' you get A3 in the following way: a<b implies (by A2) 0<b-a which implies (by A3') 0<bc-ac for 0<c from which it follows ac<bc (again by A2).
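Not that it replaces the proof above, but a quick numeric spot-check of Statement 2 and the corollary over random integers may help convince a skeptic (ordinary integer arithmetic assumed):

```python
# Spot-check Statement 2 ((c<0 and a<b) implies bc<ac) and its corollary
# (two negatives multiply to a positive) on random integer triples.
import random

random.seed(0)
for _ in range(1000):
    a, b = sorted(random.sample(range(-100, 101), 2))  # distinct, so a < b
    c = random.randint(-100, -1)                        # c < 0
    assert b * c < a * c          # Statement 2: multiplying by c<0 flips the order
    if a < 0:
        assert 0 < a * c          # corollary: product of two negatives is positive
print("all checks passed")
```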
Pi and its relation to a perfect circle illustrates neatly why it's only tangentially related (yet still very useful) to our understanding and applications of nature. You can't create a perfect circle in reality; you'd need infinite precision for that (infinitesimally small sizes), which reflects neatly in the constant. I guess a similar story can be used for "e" too, but I haven't thought about it nearly enough and its uses are much more esoteric to me, so no claim about that one!
I'm not trying to diminish the beauty of math, by the way, just trying to argue how man made (or how conceptual) maths actually is. It's more of an abstract thing than a real thing imo; and that in itself is awesome if you think about it.
Random linguistic question, after reading everyone's preferences:
In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", oftentimes as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and corresponds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks
On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:
In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks
i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural
On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:
In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks
i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural
Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.
On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:
In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks
i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural
Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.
That's why he wrote it sounds plural, not it is
Also, proving fundamentals without defining axioms, especially when using derived concepts which are often defined using said fundamentals, seems pointless.
On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:
In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks
i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural
Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.
thanks for the english lesson lol, now how about you take some comprehension lessons
On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:
In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks
i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural
Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.
That's why he wrote it sounds plural, not it is
Also, proving fundamentals without defining axioms, especially when using derived concepts which are often defined using said fundamentals, seems pointless.
Right, but that's an inconsistency with what ahswtini said, not with what I said. He said that since mathematics "sounds plural" he thinks that the abbreviation math should "be plural". Sounding plural doesn't mean something is plural, and if an abbreviation is supposed to represent the same thing, then shouldn't the abbreviation be the same singular or plural type? It matters with subject/verb agreement, etc.
On October 06 2017 23:27 Uldridge wrote: I literally don't know what it is, so I'm always confused about how to write it. I guess that's why I write both math and maths lol
That's fair. I think consistency is probably better.
On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:
In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks
i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural
Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.
thanks for the english lesson lol, now how about you take some comprehension lessons
Not sure why you're so offended, but your response doesn't help clarify anything.
The OP says "So what is welcome here? Anything math related! Random questions, theory, homework discussion, anything." This is definitely a related random question, although apparently it's upsetting you >.> I didn't mean to offend you; I was just curious as to conventional spelling preferences.
If the pair of you can just stop with the little love spat, maths and math, the spelling and spoken difference probably occurs for the same reason as Aluminium and Aluminim...
...Americans are too lazy to speak English properly.
On October 07 2017 02:08 Dangermousecatdog wrote: If the pair of you can just stop with the little love spat, maths and math, the spelling and spoken difference probably occurs for the same reason as Aluminium and Aluminim...
...Americans are too lazy to speak English properly.
Aluminum*
But seriously, it was an honest, math-related question. Other people responded with actual replies; I'm not sure why ahswtini took it so personally. People are more than welcome to move on to a different question, and I don't think anyone is going to be trying to spell-check everyone else who chooses to spell it "math" or "maths"
On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:
In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks
i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural
Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.
It may not be plural in English, but it is in Spanish. It's las matemáticas, not la. So given that Spanish is closer etymologically to the common Latin root "ars matematica" (and the Romans simply stole the Greek word for learning there), treating it as a plural if it has the s is probably best.
However, English in general has a weird fetish of replacing the original a with an s (and keeping it as a singular word): mechanics, electronics, analytics, physics.. all from originally Greek words ending in an a sound (especially after passing through Latin).
Finally, your example is wrong if trying to prove mathematics is not plural (not saying it is: English is a fucking weird bastardization of a language). "My favorite topic is birds/trees/cars". These are all plural, but it's not "my favorite topics are ..." in a sentence like this. The subject works in the same way.
On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:
In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks
i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural
Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.
It may not be plural in English, but it is in Spanish. It's las matemáticas, not la. So given that Spanish is closer etymologically to the common Latin root "ars matematica" (and the Romans simply stole the Greek word for learning there), treating it as a plural if it has the s is probably best.
However, English in general has a weird fetish of replacing the original a with an s (and keeping it as a singular word): mechanics, electronics, analytics, physics.. all from originally Greek words ending in an a sound (especially after passing through Latin).
Finally, your example is wrong if trying to prove mathematics is not plural (not saying it is: English is a fucking weird bastardization of a language). "My favorite topic is birds/trees/cars". These are all plural, but it's not "my favorite topics are ..." In a sentence like this. Subject works in the same way..
English is definitely an inconsistent language lol, and those other Greek words are good examples too. And I didn't know about the relationship to Spanish; thanks for sharing that!
Depending on the dictionary (which just makes the issue even less consistent), it seems that "mathematics" can also be seen as technically singular as well, yet also function as either singular or plural: http://www.thefreedictionary.com/mathematics
So I guess the bottom line is: "Math is ___" or "Maths is ___" are both acceptable conventions based on location (and that the English language is often times silly and inconsistent). Understood. Thanks everyone for responding
I am not a mathematician. I am reading Matt Parker's "Things to make and do in the fourth dimension" and he just mentions that there are only finitely many polydivisible numbers. I looked it up on Wikipedia and they just mention it as well, and I wonder how to show that. Intuitively I would have guessed there are infinitely many.
edit: Oh, I got it. Fun question though, don't you think?
I get why there would be a finite number. There are a finite number of 10-digit polydivisible numbers, and each one can create either 0 or 1 polydivisible numbers with the same starting digits. Repeat, and the total number of polydivisible numbers with n+1 digits is fewer than or equal to the number of polydivisible numbers with n digits (n≥10). I'm not sure how to show this eventually reaches 0, though.
On October 09 2017 03:51 The_Templar wrote: I get why there would be a finite number. There are a finite number of 10-digit polydivisible numbers, and each one can create either 0 or 1 polydivisible numbers with the same starting digits. Repeat, and the total number of polydivisible numbers with n+1 digits is fewer than or equal to the number of polydivisible numbers with n digits (n≥10). I'm not sure how to show this eventually reaches 0, though.
Presumably you show convergence with an inductive proof with a series. However, that just shows there is a finite number. Figuring out that that is 20,456 takes further work. I don't know much about number theory, though, so count me out.
I'd guess that the simplest way to figure out how many there are is to just find them all, because it is very easy to prove that you have found the last one. (Test all of the polydivisible numbers with n digits, test 0-9 as digit n+1 and see if it is divisible by n+1. If not, there are no more of them ever afterwards.)
With computers, it shouldn't be hard to just try numbers until they stop working.
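That brute-force search is tiny in practice; a sketch of it in Python, following exactly the extend-by-one-digit idea described above (the 20,456 figure is the one quoted earlier in the thread):

```python
# Enumerate all polydivisible numbers by breadth-first extension:
# a number is polydivisible if its first n digits, read as a number,
# are divisible by n, for every n up to its length.
def all_polydivisible():
    found = []
    level = list(range(1, 10))  # the 1-digit cases (no leading zero)
    n = 1
    while level:
        found.extend(level)
        n += 1
        # Keep only the extensions whose first n digits are divisible by n.
        level = [x * 10 + d for x in level for d in range(10)
                 if (x * 10 + d) % n == 0]
    return found

nums = all_polydivisible()
print(len(nums))  # 20456, matching the count mentioned above
```

The loop terminates on its own: once some length produces no valid extensions, there can never be a longer polydivisible number, which is the "very easy to prove you have found the last one" point from the post above.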
The problem was to find the distance from y to the plane in R^3 spanned by u_1 and u_2, where y = [5; -9; 5], u_1 = [-3; -5; 1] and u_2 = [-3; 2; 1].
Can someone explain why step 5/6 cares about squaring the distance? Why does that matter? They are just squaring the distance and then immediately taking the square root again.
Isn't the entire point to find the length of y - yhat ? My book's examples are doing the same thing. I feel like there must be a point but I don't understand what it is.
Squaring and then square rooting is typically how you find the absolute value of a number, so that negative differences and positive differences are still represented in the final summation, rather than them cancelling out ahead of time. That's typical in distance formula problems and other kinds of problems where you need to add up variations, like with standard deviations in statistics.
But length of a vector, ||v|| means precisely to take v*v and then square root it, right? So length will always be positive. So squaring ||v|| and then square rooting it is redundant. (well, in R^n, right?)
In R^n you are right, this is redundant. ||x-y|| is sufficient. If I had to guess why this is written this way: In other applications the 2-norm (euclidian norm) is used in combination with a (...)^2 to cancel out the square root, which is where the author might have the notation from. Obviously to get the distance you then have to re-add the root.
You calculate ||x||² first, because that just involves not taking the square root when computing the norm. Otherwise you would compute ||x||, then square it, then take the root, which would indeed be silly.
They calculate ||x||² , which just means taking the squares of all components (without root), and then in the next step they take the sqrt of that.
There is no deeper mathematical reasoning for this, it is just a different way to write down the stuff that you would already do, namely squaring all components, adding them up, and then taking the square root. So they don't calculate Sqrt(v*v) and then square it, and sqrt it again. That would be silly. They calculate v*v=||v||², and then sqrt it to get the value of the distance.
The reason this style of writing is chosen is probably that in a lot of cases, you don't actually need to take the square root in the end. An example would be sphere equations in R³. Sure, you could take the root on both sides of the equation, but that just makes the equation more ugly and more annoying to deal with.
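Tying this back to the concrete numbers in the original problem (y = [5; -9; 5], u_1 = [-3; -5; 1], u_2 = [-3; 2; 1]), a quick numerical sketch; conveniently, u_1 and u_2 happen to be orthogonal here, so the projection formula is simple:

```python
import numpy as np

y  = np.array([5.0, -9.0, 5.0])
u1 = np.array([-3.0, -5.0, 1.0])
u2 = np.array([-3.0, 2.0, 1.0])

assert u1 @ u2 == 0  # the spanning vectors are orthogonal

# Orthogonal projection of y onto span{u1, u2}:
yhat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2

# Compute ||y - yhat||^2 first (no square root), then one sqrt at the end,
# which is exactly the book's style of writing discussed above.
dist_sq = (y - yhat) @ (y - yhat)
dist = np.sqrt(dist_sq)
print(dist)  # 6.324555..., i.e. 2*sqrt(10)
```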
that’s almost how i first wrote it down. but not quite- and i’m still not reasonably sure i understand why that’s the case. it looks like you’ve set it up as a calculation of a weighted percent increase. but i don’t understand why that works.
5000 games came to 49% success, so 0.49 * 5000 = 2450 successes occurred. In your case, 500 games came to 44% success, which is 0.44 * 500 = 220 successes. That means that 2450-220 = 2230 successes came from other people in 5000 - 500 = 4500 games. The success rate of that is 2230 / 4500 = 49.556%.
Pretty easy. You know your own win percentage and how many games that was. You also know that those games are a subset of a bigger set, for which you also know the win percentage. You want to know the win percentage of the bigger set without your own games. Or, in equation form:
THANK YOU. i had been doing exactly that but couldn’t make the logical leap to that one last step.
On October 19 2017 23:41 Acrofales wrote: Pretty easy. You know your own win percentage and how many games that was. You also know that those games are a subset of a bigger set, for which you also know the win percentage. You want to know the win percentage of the bigger set without your own games. Or, in equation form:
4500*x + 500*.44 = 5000*.49
Solving for x gives you the earlier equation.
yeah, this is exactly how i started except for the 5000. i was missing the realization TT walked me to in order to figure out my problem.
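The arithmetic in this exchange can be wrapped in a tiny helper (the function name is mine; the numbers are the ones from the posts above):

```python
def rate_excluding_subset(total_games, total_rate, sub_games, sub_rate):
    """Win rate of the larger pool once a known subset is removed.

    Solves (total - sub) * x + sub * sub_rate = total * total_rate for x.
    """
    other_wins = total_games * total_rate - sub_games * sub_rate
    return other_wins / (total_games - sub_games)

x = rate_excluding_subset(5000, 0.49, 500, 0.44)
print(round(x * 100, 3))  # 49.556, matching the 49.556% worked out above
```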
I have a problem in my linear algebra class, I don't understand how to solve it. I find it very confusing.
The question says to give an example of a compact set A, and a closed set B in R^2, such that ((conv A) intersection (conv B)) is empty, but A and B cannot be strictly separated by a hyperplane.
Eventually I just looked for an answer online and I found this:
I don't understand why conv A would be the segment [-3, 3]. Wouldn't it be the segment (-10, 10)? And isn't A in this example not even closed?
Okay, so maybe someone can explain my misunderstanding there. But I guess, if I am correct, I could use -10 < x < 10 from his example. And if I do that, then I think this answer works: there should be no hyperplane that strictly separates A and B, and (conv A) intersection (conv B) should be empty.
Sorry, the page doesn't display properly for me, so I can't help you out with this concrete example. But the point of this problem seems to be that conv(B) of an unbounded closed set B is not necessarily closed. If you then take for A any point on the boundary of conv(B) that is not itself in conv(B), you get your solution.
Example: B := {(x, |1/x|) : x ≠ 0}, A := {(0,0)}. As a graph, B is closed and never touches the x-axis, but conv(B) contains lines parallel to the x-axis that get arbitrarily close to it.
[EDIT]: Finally I was able to open the page. Funnily enough, the example constructed there is basically the same as mine (it is the simplest one I can think of). It appears [-3,3] is a typo; it should be [-10,10] like you said, but you must define -10 <= x <= 10, otherwise the set is not closed. Or simpler still, just take my example where A is a single point on the x-axis.
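A quick numerical illustration of this example (a sketch, not part of the original post): the midpoint of the symmetric pair (t, 1/t) and (-t, 1/t) lies in conv(B) and equals (0, 1/t), which gets arbitrarily close to A = {(0,0)} without ever reaching it.

```python
# B = {(x, |1/x|) : x != 0}. The midpoint of (t, 1/t) and (-t, 1/t)
# is (0, 1/t): it lies in conv(B) and approaches (0, 0) as t grows,
# while (0, 0) itself is never contained in conv(B).
def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

for t in (1, 10, 100, 1000):
    p, q = (t, 1 / t), (-t, 1 / t)
    print(midpoint(p, q))  # (0.0, 1/t)
```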
Hey everyone - there's this one problem I can't seem to figure out.
Here's what we are given:
Grad f(2 , 5)=(1 , 3)
h(x,y) = f(2y+x^2,2x-y)
We are asked to find the partial of h with respect to x
My thoughts/confusion:
Isn't grad of a function supposed to produce a vector-valued function? Notation-wise I am perplexed as to which x we are meant to take the partial with respect to.
I am really at a loss - either I just really don't get this, or there's something about the notation that isn't clear.
If any of you could help me see the light, that'd be greatly appreciated.
On November 16 2017 19:40 amyamyamy wrote: Hey everyone - there's this one problem I can't seem to figure out. [...]
Hello, amyamyamy.
You're correct: grad of a function gives a vector-valued function. The first given equation we have is called a vector equation which gives rise to a system of equations. I'll show you. We can rewrite the left side as a vector (the definition of gradient) evaluated at x = 2 and y = 5.
Distribute the evaluate symbols to each of the components.
The fact that the vectors are equal means each component must be equal. This gives us the system of equations I mentioned earlier.
We'll make use of this system in a bit. For now we can start to look at the second given equation.
We don't just have x or y in the arguments of f but actually functions of x and y, so we make substitutions. We can use whatever variables we like for them. I'll use a and b.
So we have
Now differentiate both sides partially with respect to x. We can use the chain rule for multivariable functions to write df/dx in terms of the new variables, a and b, as shown below. Substitute the derivatives of a(x,y) and b(x,y).
Setting x = 2 and y = 5, we can use the first equation of the system.
We have here one equation in two unknowns, df/da and df/db. To obtain a second equation for them, we consider df/dy. Use the chain rule again to write it in terms of the new variables, a and b.
Setting x = 2 and y = 5, we can use the second equation of the system. (x and y aren't present, so it doesn't change anything.)
Solve this system of equations for df/da and df/db with substitution or elimination, whichever you prefer.
Plugging df/da and df/db into the formula for dh/dx, we get
Therefore,
I hope this helped you out. In order to avoid having to think about what the "real" x is, just make a substitution like I did, and you should be fine.
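Since the worked equations above were posted as images, here is a hedged numerical sketch of the central chain-rule step, dh/dx = (df/da)·2x + (df/db)·2. The f below is made up purely for the check (the actual f from the problem is never given, only its gradient at one point); any smooth f would do.

```python
def f(a, b):
    # Hypothetical f - only used to sanity-check the chain rule numerically.
    return a * a + 3 * a * b

def h(x, y):
    return f(2 * y + x * x, 2 * x - y)

def partial(g, i, point, eps=1e-6):
    """Central finite difference of g in coordinate i."""
    p_hi = list(point); p_hi[i] += eps
    p_lo = list(point); p_lo[i] -= eps
    return (g(*p_hi) - g(*p_lo)) / (2 * eps)

x0, y0 = 1.0, 2.0
a0, b0 = 2 * y0 + x0 * x0, 2 * x0 - y0   # the inner arguments a and b

# Chain rule: dh/dx = df/da * da/dx + df/db * db/dx = df/da * 2x + df/db * 2
lhs = partial(h, 0, (x0, y0))
rhs = partial(f, 0, (a0, b0)) * 2 * x0 + partial(f, 1, (a0, b0)) * 2
print(abs(lhs - rhs) < 1e-4)  # True
```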
Hello! I got an idea: Let P be a polygon. SP denotes the polygon derived from P whose sides connect the centers of P's neighbouring sides. For each integer n, find a construction of P such that S^i P is not convex for all i from 0 to n, and S^{n+1} P is convex.
Perhaps you've discussed that already so if you have I apologize.
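A sketch of the construction (the function names and vertex-list representation are my own, not from the post): P is a list of vertices, SP connects the midpoints of neighbouring sides, and convexity can be checked by testing that the polygon always turns the same way at each vertex.

```python
def midpoint_polygon(P):
    """SP: connect the midpoints ("centers") of neighbouring sides of P."""
    n = len(P)
    return [((P[i][0] + P[(i + 1) % n][0]) / 2,
             (P[i][1] + P[(i + 1) % n][1]) / 2) for i in range(n)]

def is_convex(P):
    """True if the polygon turns the same way at every vertex
    (all nonzero cross products share a sign)."""
    n = len(P)
    signs = set()
    for i in range(n):
        ax, ay = P[i]
        bx, by = P[(i + 1) % n]
        cx, cy = P[(i + 2) % n]
        cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
        if cross != 0:
            signs.add(cross > 0)
    return len(signs) <= 1

# A dented pentagon: not convex, but one application of S fixes it here.
# The puzzle asks for shapes needing exactly n+1 applications.
P = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]
print(is_convex(P), is_convex(midpoint_polygon(P)))  # False True
```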
There's a frog riddle going around where you have two frogs that are each equally likely to be male or female, and you're trying to find the chance that at least one of them is female. That chance is 75% (MM, FM, MF, FF). You now learn that one of them is male, but you don't know which one, so the riddle says it becomes 66% likely (MM, FM, MF).
Now if you knew that the first one was male instead of either one, the chance would be 50% (MM, MF).
The initial riddle is flawed: you find out that one is male because males croak and you heard a croak. However, it doesn't take into account that if only males croak, you're more likely to hear a croak from a set of MM than from a set of MF or FM, because either of the two could have croaked at that moment rather than just one.
I want to discard that and ask: shouldn't the information that we are aware one of the frogs is male be introduced into the equation? Something like a "sure male" (sM); then we would get (sMM, MsM, sMF, FsM) and a 50% chance?
If you have a detector that can only detect if there is a croak, but not the volume of croak, and if male frogs certainly croak, the riddle works.
Otherwise, you have to do pretty complicated stuff with regard to calculating the probable number of male frogs based on the number of croaks that you detect and the average number of croaks per second a male frog emits. If you want to solve something like that, more information is needed.
You can use a different version of the riddle and just ask a woman who you know has two children whether she has a son, avoiding the croaky problem.
On November 29 2017 03:17 Nebuchad wrote: I want to discard that and ask, shouldn't the information that we are aware one of the frogs is a male be introduced in the equation? something like "sure male" (sM), then we would get sMM, MsM, sMF, FsM) and a 50% chance?
sMM, MsM, sMF, FsM : yes
50%: no. Both sMM and MsM map to the MM category, which had a 25% initial probability and a 1/3 probability knowing that at least one frog is male (so sMM ≈ 16.7%, MsM ≈ 16.7%, sMF ≈ 33.3% and FsM ≈ 33.3%).
That one frog is male is introduced into the equation - the total number of outcomes has been reduced based on that information. I'll try to explain clearly by first explaining a bit about how probability works.

When the options are random (ie all equally likely), the probability of something being true is

(number of outcomes where the thing is true) / (total number of outcomes)

This is how you get 3/4 = 75% for the first one. There are 4 outcomes in total and 3 have at least one female. If you are told that at least one is male, then you have only 3 total outcomes, of which 2 include a female. Hence the probability is 2/3.
It is called conditional probability. The general formula is P(A|B) = P(A and B) / P(B), where P(A|B) means the probability of A being true given that we know B is true.

Justification:
If you are thinking about random outcomes then the formula is easy to show. If you let N be the total number of outcomes and N(A) be the number of outcomes for which A is true, then P(A) = N(A) / N and P(A|B) = N(A and B) / N(B) = ( N(A and B) / N ) / ( N(B) / N ) = P(A and B) / P(B).
In general we define conditional probability this way.
So in this case you get P(one female given at least one male) = P(one female and one male) / P(at least one male) = (2/4) / (3/4) = 2/3
I am not entirely certain how you have become confused with the 'sure male' thing, but I think you have overlooked that those outcomes are not all equally likely, because the sexes of the two frogs are no longer independent - if the first frog is female then the second frog must be male. Hence you cannot do probability in the same way.
An alternative way of looking at this is to say that if only one frog is male (FM or MF) then that male frog must be the sure male, but if both are male then there is equal probability that it is the first or second. Hence P(FsM) = P(sMF) = 2P(sMM) = 2P(MsM). Also knowing that P(sMM) + P(MsM) + P(FsM) + P(sMF) = 1 gives you P(FsM) = P(sMF) = 1/3 and P(sMM) = P(MsM) = 1/6.
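The 2/3 answer can also be checked by brute-force enumeration of the four equally likely outcomes (a sketch; the event names are mine):

```python
from fractions import Fraction
from itertools import product

# The four equally likely sex assignments: MM, MF, FM, FF
outcomes = list(product("MF", repeat=2))

def prob(event):
    hits = [o for o in outcomes if event(o)]
    return Fraction(len(hits), len(outcomes))

def at_least_one_male(o):
    return "M" in o

def at_least_one_female(o):
    return "F" in o

# Conditional probability: P(A|B) = P(A and B) / P(B)
p = prob(lambda o: at_least_one_female(o) and at_least_one_male(o)) / prob(at_least_one_male)
print(p)  # 2/3
```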
On November 29 2017 04:56 Melliflue wrote: That one frog is male is introduced into the equation - the total number of outcomes has been reduced based on that information. I'll try to explain clearly by first explaining a bit about how probability works. When the options are random (ie all equally likely)
Important to mention here: random and equally likely are not the same. If I roll a D6 and look at whether I rolled a 1 or not, the result is obviously random, and the options are also obviously not equally likely (1/6 yes, 5/6 no). The Laplace assumption is tempting, but one needs to think for a few seconds about whether it actually applies to the current situation, and if yes, to which basic events it applies.
Other than that, your analysis of the situation seems good.
On November 29 2017 04:56 Melliflue wrote: [...] When the options are random (ie all equally likely) [...]
Important to mention here: Random and equally likely are not the same. [...]
This comes down to your definition of "random". The definition of "random" that I'd seen used in probability theory is "all outcomes are equally likely". So the outcome of 1 D6 is random (6 outcomes, all equally likely) but the sum of 2 D6 is not (11 outcomes, probability skews towards the middle) although the full set of 36 outcomes is random. A D6 always has 6 outcomes even if you are grouping multiple outcomes together. You cannot restrict a D6 to two outcomes (1 and not 1). If you had a cube with a "1" on one side and "not 1" on five sides then it is not random in that strict sense.
The fuzzier definition of "random" as "unpredictable" is not mathematically useful, at least in probability theory, because probability theory only makes sense when outcomes are unpredictable anyway.
On November 29 2017 04:56 Melliflue wrote: [...]
Important to mention here: Random and equally likely are not the same. [...]
This comes down to your definition of "random". The definition of "random" that I'd seen used in probability theory is "all outcomes are equally likely". [...]
That is a weird definition of random which I have never heard before. It might be a language thing - maybe "random" doesn't map 100% onto "Zufall" - but I don't really think that is the case. I just looked up the first two books on probability theory available as ebooks from my university library (Klenke, Bhattacharya), and neither of them seems to use that definition of random. (They don't actually define random at all, because that isn't necessary for probability theory: you are just measuring subsets of a full event space with a probability distribution function that has a few basic properties making it behave the way you expect probabilities to behave, based on the Kolmogorov axioms extended to fit larger event spaces.)
What you call random I have always heard called the "Laplace assumption" in my probability classes. I don't think you are correct.
On November 29 2017 04:56 Melliflue wrote: [...]
This comes down to your definition of "random". The definition of "random" that I'd seen used in probability theory is "all outcomes are equally likely". [...]
Would you mind specifying the level of the course where you heard this definition? Maybe it is a language problem, but this looks very weird to me. From my impression (I've taken courses in probability theory up to the Black-Scholes formula, even though this was a few years back), the word "random" is almost exclusively used in conjunction with other terms in a precise mathematical context, mostly in terms such as "random variable". What you call "random" is a specific property which a probability measure may or may not have.
edit: Well I should read all new posts before posting.
I was trying to avoid making it too complicated, and this may be a language issue. The word 'random' has a strict and a loose meaning in English. The looser version means that there is some element of chance involved, it wasn't picked. In expressions such as "random variable" it is being used in this looser sense.
The stricter definition is that all possible results are equally likely (if in discrete probability) or has a uniform distribution (if in continuous probability). This is what is meant when talking about a random die, a random sample, or a random number generator.
I doubt either definition appears in a textbook, certainly not in an undergraduate or graduate level textbook. You would be better off looking in a normal English dictionary. Both definitions are used inside and outside mathematics, although in my experience, at university level the stricter version is less common.
While "Zufall" does map onto "random" for some things it is far from perfect. I did not take a course on probability while in Germany so I am not certain on all the German terms in probability, but in conversational usage I would not translate "Zufall" or "zufaellig" as "random". A better translation of "Zufall" would be "accident" or "coincidence" or "fluke".
I got annoyed by Simberto's use of "obviously". It is a pet peeve of mine when people use "obviously" (or "clearly" or any other such word) when writing about maths. It adds nothing but can be condescending. If the reader understands then they don't need to be told it is obvious, and if they don't understand then telling them it is obvious will not help them to understand.
In my experience, "obviously" when used in a maths lecture means "I don't want to prove this; do it yourself if you don't see why it is true". And I can see why it would annoy you. In fact, it is a pretty common joke among the students of the maths department at my university to randomly use "obviously" for things that very clearly are not obvious (or to try to get by with not proving something that should be proven by calling it obvious), to mock the professors who use it in the way you mentioned above.
I didn't intend to use it in that way, though; I wanted to use the more common meaning of "I don't think anyone would disagree".
And I think that sometimes telling someone that something is obvious may help with understanding: it means that you have to look for a simpler explanation of why it is true. As soon as you need too many steps to figure it out, you are probably not on the right track for seeing why something obvious is obvious. Most of the time, though, especially in textbooks, it is infuriating. It usually takes me 20 minutes to understand the actual proof, and then another two hours to figure out why something which is called obvious is in fact obvious. And then I notice that instead of just writing that it is obvious, the author could have used the same number of words to explain the thing, and I get pissed off.
On December 06 2017 09:43 travis wrote: This is probably way beyond anything I should be asking, but,
If I have N points on a cartesian plane, is there a way to construct a function that describes either:
1.) all the lines between all the points
or
2.) the polygons that can be created by the points (I guess this would require more than one function....)
thanks.
I think you should make that question more precise. Are you looking for an object of the shape
f : {subsets of R^2 with N elements} -> R^2, (a1, ..., aN) |-> (all points that are contained in the lines between a1, ..., aN)
Because at least in theory, this is rather easy. You can define f((a1, ..., aN)) by f((a1, ..., aN)) = {x in R^2 | there are ai, aj, m in [0,1] such that x = ai + m*(aj - ai)}
Hope this can be understood. Haven't learned how to embed TeX here.
The question is what should map onto what with that function. If you have your points in R², and want to have a function that maps x coordinates onto the y coordinates of the lines/polygons that you have described, that is generally not possible. The reason for that is that a function of R-->R always maps one x onto exactly one y, but the lines you describe will in most cases have more than one point with the same x coordinates, but different y coordinates.
If you go more abstract, and first define a space of all possible lines in R² (for example by describing a line via its start and end points, making a line basically an object of R^2 x R^2), let's call it A,
and then map a set of N points in R^2 (which is an object in (R^2)^N) onto an appropriate product of copies of A, you will probably need something like A^((N-1)*N/2).
((N-1)*N/2 is the number of lines between N points here, if I am not mistaken.)
You can create a function f: R^2N ---> A^((N-1)*N/2) which maps (x1, y1) x ... x (xN, yN) onto Line(x1,y1,x2,y2) x Line(x1,y1,x3,y3) x ... x Line(xN-1,yN-1,xN,yN)
The same should be possible with polygons.
Thinking about this, this is probably way too convoluted, and you should go with the above described map, which also works.
Indeed, I should've put P(R^2) (the set of all subsets of R^2) as the codomain instead of R^2 itself. I'm a bit in a hurry now; will comment later when I have more time.
Not quite sure what Simberto is on about. The input is clearly not R^2, but rather R^4 (x1, y1, x2, y2) for two points, R^6 for 3 points, etc., and the outputs are "characterizations of lines (or polygons)". Exactly what that's supposed to be beyond a set of points in R^2 I don't know, but let's be less restrictive and map them to equations; then we can do something like:
F(x1, y1, x2, y2) = [(y-y1)(x2-x1)=(y2-y1)(x-x1)]
if you can guarantee x1 and x2 are unequal you can rewrite this as a standard linear function, but I will leave it like this for now.
This is a general function that, for any two points, will output a linear equation characterizing the line that goes through them. Now for N points, you have (N^2 - N)/2 different unordered pairs of points, and thus you will get (N^2 - N)/2 different equations as your output. I'm not quite sure what the exercise is supposed to be, but let's roll with it.
For the rest of the question, I am just as confused as ^^ those guys. You can, of course, create an algorithm to find all possible triangles, quadrilaterals, etc. that can be constructed from a set of N points. I'm not sure what precisely the time complexity is, but something like O(n^(n/2 + 1)) is a good starting estimate: this is superexponential. So you should probably rethink why you need all possible polygons that can be drawn using N points in a plane.
As for the function representing this construction? It'd be some monstrosity of the kind f: R^2N -> P(R^2N)
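A sketch of such a function (the coefficient representation (a, b, c) is my own choice; it is the posted equation (y - y1)(x2 - x1) = (y2 - y1)(x - x1) rearranged, so it also handles vertical lines):

```python
from itertools import combinations

def line_through(p, q):
    """Coefficients (a, b, c) with a*x + b*y = c for the line through p and q.
    Rearranged from (y - y1)(x2 - x1) = (y2 - y1)(x - x1), so vertical
    lines need no special case."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, -(x2 - x1)
    return (a, b, a * x1 + b * y1)

def all_lines(points):
    """One equation per unordered pair of points: N(N-1)/2 of them."""
    return [line_through(p, q) for p, q in combinations(points, 2)]

print(all_lines([(0, 0), (1, 2), (3, 1)]))  # three (a, b, c) triples
```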
On December 06 2017 20:50 Acrofales wrote: Not quite sure what Simberto is on about. [...]
I wanted to say basically the same thing as you: you can't solve this as the graph of a function R --> R.
You can define a function as Mafe did, which maps from P(R²)--->P(R²).
Or you can define a function as I did, from R^2N ---> A^((N-1)N/2). A problem I just noticed is that in this case, the order of the points and the order of the resulting lines are both relevant, which they don't actually need to be. Mafe's solution is definitely more elegant.
I should note in my defense that I am currently sick and my brain feels muddy.
Sorry for not replying. You guys pretty much covered what I needed, and I moved on.
I do have a new question, you guys might find interesting.
Let's say I have a set of n numbers. I have a target, which is the sum of all n numbers. Now let's say I create a new set containing every k-combination of those numbers, so (n choose k) combinations. In this instance, let's say k = 4, so I have (n choose 4) combinations made up of my n numbers.
How many combinations of my (n choose 4) combinations can equal my target sum?
Is there a general formula or easy way to calculate how many combinations can equal the target sum, for any (n choose k) combinations?
Thank you!
edit: I should be more specific. I am asking about how many permutations there are to reach the sum, not combinations. Which makes me ask, is the answer (n/4)!? edit2: Nope, it couldn't be. Is it (n/4)! * (n choose 4)? This is mindbending.
S is the sum of all of your n numbers. M is the sum of k <= n of those numbers.
You are looking for the number of picks for which M = S, for a given k?
If you have only positive numbers, that number of combinations (let's call it q) is 0 if k < n, because any sum of k numbers will always be smaller than the sum of those same numbers plus n-k additional positive numbers. If k = n, then q = 1.
If you also allow 0 in your base set, the number of combinations depends on how many 0s are in there.
If you also allow negative numbers, I doubt that there is a general solution. It depends on how many combinations of numbers cancel out, and it is going to get complicated.
We started with a set of numbers (1, 2, ..., n), where the elements increment by 1, sequentially. The elements are positive integers, starting at 1.
We have a target, S, which is the sum of all numbers in this set.
We take this set, and we create the (n choose k) unique combinations.
We now have a set with (n choose k) combinations in it. Example, k = 4: our set is ((1,2,3,4), (1,2,11,n), ..., (11,23,73,88), ..., (7,55,386,n)), etc.
Each element of this set has a value equal to the sum of its entries. So (1,2,3,4) has a value of 10, and (7,55,386,n) has a value of (448+n).
How many permutations are there by which we can take k elements from our new set and reach our target sum S? How many combinations are there? (Probably the question I care more about.)
Additional constraint: how many permutations are there by which we can take k elements from our new set such that every element of every chosen combination is unique?
Thanks!
Edit: Sorry, I am back from lunch. I am less rushed now, and have edited my post to be specific.
If we start categorizing the sums in your second set, we need to look at how often each sum appears. You have:
1 times (k+1)k/2 (the minimal sum 1+2+...+k),
1 times (k+1)k/2 + 1 (1+2+3+5 in the k=4 example),
2 times (k+1)k/2 + 2 (since 1+2+3+6 = 1+2+4+5),
and so on.
Maybe it helps to see which different sums one gets, and in what amounts, to figure things out? Not really at a conclusion yet, and I'm also going to eat now.
But I am still not convinced that there is a simple answer. Even for very small numbers, the results fluctuate pretty hard.
n=3, k=1 has one solution (k=1 and k=n always mean there is only one solution, since both sets are equal)
n=3, k=2 has none (the second set is (1,2), (1,3) and (2,3))
n=4, k=1 has one solution
n=4, k=2 has three solutions - the second set is (1,2), (1,3), (1,4), (2,3), (2,4), (3,4), and each solution contains each number of the base set exactly once
n=4, k=3 has no solutions
If there is a general solution, it would have to factor in things like how easily k fits into n.
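These small cases can be checked by brute force (a sketch of my own; it enumerates every subset of the k-combinations, so it is only feasible for tiny n):

```python
from itertools import combinations

def count_solutions(n, k):
    """Count sets of distinct k-combinations of {1..n} whose sums add up
    to S = n(n+1)/2. Brute force over 2^(n choose k) picks: tiny n only."""
    S = n * (n + 1) // 2
    sums = [sum(c) for c in combinations(range(1, n + 1), k)]
    count = 0
    for r in range(1, len(sums) + 1):
        for pick in combinations(range(len(sums)), r):
            if sum(sums[i] for i in pick) == S:
                count += 1
    return count

for n, k in [(3, 1), (3, 2), (4, 1), (4, 2), (4, 3)]:
    print(n, k, count_solutions(n, k))
```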
((n choose k) * ((n-k) choose k)) / (n/k) combinations that will equal our sum:
(n choose k) is based on each individual starting element, and then ((n-k) choose k) counts the combinations of what is left that can be combined with the starting element to form the sum. But then this all must be divided by (n/k), otherwise we will repeat starting elements that have already been used in a previous combination.
And then the number of permutations would be equal to combinations * k!, which is the number of possible orderings of a given combination.
So yeah, I did this, and then I realized I wasn't considering duplication of elements. For example, making a sum of 45 with (1,2,3), (6,7,8), (1,8,9).
So I will be impressed if anyone can solve how many combinations there are that don't include repeats.
(I'll also be impressed if I didn't screw up my conclusions above.)
I guess it could be said that when you have a combination like the above - trying to reach 45 with 3 combinations of 3 numbers, with no repeats - you must use every number in n exactly once.
So we can say there are 9 choose 3 possibilities for set 1, then 6 choose 3 possibilities for set 2, then only 1 possibility for set 3.
So instead of ((n choose k) * ((n-k) choose k))/(n/k),
it needs to be [the product from i = 0 to n/k - 1 of: ((n-ik) choose k)] / (n/k)?
That sounds correct, but it only solves the question of what to do without allowing doubles.
And you can further simplify your result.
n choose k = n!/(k! (n-k)!)
(n-k) choose k = (n-k)! / (k! (n-2k)!)
(n-2k) choose k = (n-2k)! / (k! (n-3k)!)
...
with the last factor being (n-(n/k - 1)k) choose k = (n-(n/k-1)k)! / (k! (n-n)!)
so with all the (n-ik)! factors cancelling from your product, that only leaves n! / k!^(n/k).
Could be that I am missing something, but that should generally work for your result.
And I have no idea how to solve the general problem, or whether it is even possible. With the above simplification of using each number only once, any problem where k does not divide n has 0 solutions.
And then you want to know how many such B there are. Is that right?
If so, then for k = 1 there are no solutions, since the highest number you can pick is B = {n} and that is not big enough. But your number of combinations would be (using your first try) ((n choose 1) * ((n-1) choose 1))/(n/1) = n * (n-1) / n = n-1
If you don't want repeated elements (in the elements of B) then it can only possibly work if k is the square root of n. Since you can only include every number once, you need to have all of the numbers, at which point the question becomes 'how many ways are there of splitting k^2 elements into k sets, each of size k?'
The answer to that is (k^2)! / ((k!)^k * k!), because you have (k^2)! ways of listing 1, ..., k^2, and then you say the first k are the first set, the second k the second, etc. But then I don't care about the order within the first k, so I divide by k!. I have to do this for all k sets of k, which leads to (k!)^k. Finally, I can swap the first set of k with the second set of k, etc., which gives another k! factor on the bottom.
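That count, (k^2)! / ((k!)^k · k!), can be checked against direct enumeration for small k (a sketch; the helper names are my own):

```python
from itertools import combinations
from math import factorial

def partitions_into_k_groups(elems, k):
    """All ways to split elems into unordered groups of size k.
    Assumes len(elems) is a multiple of k."""
    if not elems:
        return [[]]
    first, rest = elems[0], elems[1:]
    result = []
    # Fix the smallest remaining element into a group to avoid counting
    # the same partition in a different group order.
    for others in combinations(rest, k - 1):
        group = (first,) + others
        remaining = [e for e in rest if e not in others]
        for tail in partitions_into_k_groups(remaining, k):
            result.append([group] + tail)
    return result

def formula(k):
    return factorial(k * k) // (factorial(k) ** k * factorial(k))

print(len(partitions_into_k_groups(list(range(4)), 2)), formula(2))  # 3 3
```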
@Travis, while my combinatorics is far from amazing, this question looks quite hellish XD Part of my motivation for this is that this turns into (or at least looks like it) a partition problem, and even finding the partitions of S with no constraints is tricksy.
However, your fully constrained question looks a lot more achievable. If both of your k's are the same (so each mini-sum is over a k-subset of (1, 2, ..., n), and you now want to take k of these k-subsets and add all k mini-sums to get a total of S), and S is the sum from 1 to n, then you have a situation where each element of (1, 2, ..., n) must appear exactly once. This is because, by uniqueness, you can't repeat any number, so you need every number from 1 to n to appear in exactly one combination. (Again, hoping I'm reading your question right.)
This leads to the following observation: your constrained question only has ANY solution when n = k^2. (k subsets, each of k elements, all elements unique). Because if not, you can't even partition all of (1, 2, ..., n) across the k-subsets taken.
So the problem reduces to all the ways of arranging (1, 2, ..., n) in k boxes, each box having k elements. But these arrangements correspond one-to-one with permutations of (1, 2, ..., n). (*) Which is not particularly interesting, which leads me to conclude I have misread something :| Although your additional constraint may be strong enough that this must be the case.
If this wasn't at all clear I can type something up in LaTeX that is a bit less of a mess.
EDIT: to see (*), first note that, when we "remove" the boxes, we can't possibly get an arrangement that isn't already a permutation of (1,2,...,n), and that if we "impose" the boxes, every permutation of (1,2,...,n) is also a different box arrangement. Although I'm more dubious on this than I'd like =/
EDIT 2: RIP, no. This assumes permutations all the way through, not permutations of the set of combinations. So... I lie. I'll leave it up but I lie. The previous answer explains this better and gives an answer I can't yet see to be wrong.
On December 15 2017 04:38 Melliflue wrote: I'm not sure what you want. Is it this?
And then you want to know how many such B there are. Is that right?
If so, then for k = 1 there are no solutions, since the highest number you can pick is B = {n} and that is not big enough.
there is a miscommunication here. we make n choose k combinations, but these combinations are of size n/k, not of size k.
so for k = 1, we would have a single set of (1,2,3...n), which would be our single solution.
The reasoning for needing all numbers from 1 to n appearing exactly once still holds, so I think you can easily adapt the solution here. You will still have k of the combinations appearing (to get each element exactly once, you need exactly k subsets of size n/k), whose order we do not care about. And each combination now has n/k elements, and again, we do not care about the order of elements within the combination.
So I think this works?
EDIT: If you *do* care about the orders of the combinations (but not the elements within them), then just don't divide by the k!
EDIT 2: WTF, the site I'm using for latex images changed it. It was n!/(((n/k)!)^k · k!)
And the solution for that question is in Travis' post #315 and then further simplified in my post #316. It is basically a slightly more general version of your (k²)!/(k!)^(k+1). Though it might have gotten lost in the editing of posts while others were talking.
Though I didn't get that +1; I might have overlooked it somewhere.
The result was n!/(k!)^(n/k) and it only works if k divides n. (Otherwise there is no answer to the question.)
On December 15 2017 04:38 Melliflue wrote: I'm not sure what you want. Is it this?
And then you want to know how many such B there are. Is that right?
If so, then for k = 1 there are no solutions, since the highest number you can pick is B = {n} and that is not big enough.
there is a miscommunication here. we make n choose k combinations, but these combinations are of size n/k, not of size k.
so for k = 1, we would have a single set of (1,2,3...n), which would be our single solution.
Ah, sorry about that. So in what I wrote, you would need |B| = n/k. As Ciaus_Dronu said, the logic still works. Let j = n/k for convenience. There are n! permutations of 1,...,n. I split that into k blocks, each of size j. I don't care about the order within a block so I must divide by j! for each block, which gives me (j!)^k. I don't care about the order of the blocks either so I divide by k! too. Final answer is n!/((j!)^k · k!).
(I only repeated what Ciaus_Dronu said to add that expression since they had trouble with their latex image.)
I know this is almost what you and Simberto got too, minus a notational difference and a k! in the denominator for whether or not you care about the order of the blocks. I did miss some of the edits, but I wanted to give this explanation because I think it is neater. You get straight to the final answer without needing to simplify anything, and it nicely explains what that final expression means.
I am still thinking about the case where duplicates are allowed. I have not yet seen a simple way to do that.
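The counting argument is easy to sanity-check by brute force for small n and k. A quick sketch (function names are mine):

```python
from itertools import permutations
from math import factorial

def count_block_partitions(n, k):
    """Ways to split {1,...,n} into k unordered blocks of size j = n/k,
    per the formula above: n! / ((j!)^k * k!)."""
    j = n // k
    return factorial(n) // (factorial(j) ** k * factorial(k))

def brute_force(n, k):
    """Count the same thing directly: chop each permutation into k
    consecutive blocks and deduplicate, ignoring all orderings."""
    j = n // k
    seen = set()
    for p in permutations(range(1, n + 1)):
        seen.add(frozenset(frozenset(p[i * j:(i + 1) * j]) for i in range(k)))
    return len(seen)

print(count_block_partitions(6, 3), brute_force(6, 3))  # 15 15
```

For n = 6, k = 3 both give 15 (the number of ways to split six elements into three unordered pairs: 5 · 3 · 1).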
Hey, I was messing around with continued fractions and I came across something I definitely couldn't figure out myself, and something Google didn't help me understand either
Basically I was wondering if there exists an elegant way to represent pi as an infinite continued fraction
On December 18 2017 06:36 amyamyamy wrote: Hey, I was messing around with continued fractions and I came across something I definitely couldn't figure out myself, and something Google didn't help me understand either
Basically I was wondering if there exists an elegant way to represent pi as an infinite continued fraction
I'm pretty sure there are several based on the Wolfram Alpha page, in particular a couple of nice non-simple ones.
Oh, what I meant is if there's an elegant way to derive a continued fraction representation (simple, or non-simple like your example) of pi.
Obviously one could approximate pi and construct a continued fraction from that approximation, but what I'm wondering is if there's another way to derive a CF for pi
I need to find a summation, and it might be too difficult for me to figure out myself in any reasonable amount of time. If any of you can figure it out, I would be very, very thankful. So thankful that in the future you may possibly (unlikely, but possible) get compensation.
Say we have a number of elements, n.
I have a target I want to reach: ( [the floor of(n-1)/2] - n - 1)
This needs to be found over a sum of (n-2) steps. No more, no less. I need it to be distributed such that
1.) the summation uses all (n-2) steps
2.) the size of the value being added in an earlier step is never lower than the size of the value being added in a later step
3.) the size of the value being added in each step is as uniform as possible.
example:
n = 9 target sum: 22 number of steps: 7
example sum: 4+3+3+3+3+3+3
so, I realize the question might be a little non-specific, so there may be some wiggle room for interpretation. But hopefully it's enough information to garner a good general solution.
On December 25 2017 02:20 The_Templar wrote: Maybe he meant the floor of (n-1)^2/2, and not (n-1)/2?
That would make sense with the example given. So far, splitting into cases where n is even/odd, I'm getting somewhere for even. It looks far less technical and annoying than for n odd.
EDIT: Still long and annoying though
EDIT 2: Analysis for n even won't be fun to format here, can I link an online latex document when that's done? It'll only be for the even case unfortunately. Odd will likely take hours T.T
Sorry guys, I meant ( [the floor of((n-1)^2)/2] - n - 1)
So, (n-1) is squared.
that's why in my example, with a value of n=9 we do: n-1 = 8. 8^2 = 64. 64/2 = 32. 32-1-n = 22 so our target is 22
again, sorry about the error, I need to check these more closely before I walk away.
edit: just noticed the_templar predicted what I had meant. nice job
edit2: I'm able to solve it in python (well, one possible solution... depending on interpretation I think there are more), but I still don't know how to translate it to math, and having the math may make it more likely I can extract useful information to make my algorithm better.
If it's helpful for you, this is the code:
max = ((n-1)**2)/2 - 1 - n
levels = n-2
min = max/levels
other = max-(min*levels)
max = min+1
maxlevels = other
minlevels = levels-other
this gives me a sum where the first "maxlevels" steps are of value "max", and the following "minlevels" steps are of value "min", and these values will vary by only 1
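For readability, the snippet above can be laid out as a function. This is a sketch assuming Python 3 (so `//` for integer division) and avoiding shadowing the built-ins `max`/`min`; the names are mine:

```python
def distribute(n):
    """Split the target ((n-1)^2)//2 - 1 - n into n-2 near-uniform,
    non-increasing integer steps, mirroring the snippet above."""
    target = (n - 1) ** 2 // 2 - 1 - n
    levels = n - 2
    lo = target // levels        # the smaller step value
    n_hi = target - lo * levels  # how many steps need the larger value lo+1
    return [lo + 1] * n_hi + [lo] * (levels - n_hi)

print(distribute(9))  # [4, 3, 3, 3, 3, 3, 3]
```

For n = 9 this reproduces the example in the thread: target 22 over 7 steps, as 4+3+3+3+3+3+3.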
Okay here is an overleaf read-only thing you can view. It should let you generate a PDF. It only has the even solution, but I can do the odd one now quickly too.
EDIT: You can click PDF on the top bar of the screen to get something easier to read. EDIT 2: Lots of typos, that is a "live" link so I'll fix them as I see them.
I'm not able to get this to work nicely for odd case, keep getting a 1/2 term floating around that definitely doesn't work. I've probably done something not clever >.>
EDIT: Solved. For odd numbers greater than 7 anyway (everything smaller you can have as its own case). I'll put in the solution soon, I just need to shower and then I'll type it up.
EDIT: The conclusion is as follows: For n even, you want the first (n-5) terms to be n/2 - 1, and the last 3 terms to be n/2 - 2. For n odd, you want the first (n-1)/2 - 3 terms to be (n-1)/2, and the rest of the terms to be (n-3)/2.
With the steps I have included, it is also easy to spot the special cases where all the terms can be the same and where this can't be done at all, although that's not difficult on its own.
Not sure this is math. It's more signal processing, but guess I'll post it here.
I am trying to find useful descriptors for a completely irregular (discrete) time series. And when I say completely irregular, I mean that it looks random, and looks even more random after a DFT. Here's an example:
Every sample produces a series like this one. What I plotted here is the time series, the mean and mean + 1 std deviation for this sample, and mean, and mean +1 std deviation for the entire population. Mean - 1 std deviation is always < 0, so not very useful.
What are useful ways of describing this kind of data? Because I am mostly interested in the peaks, their spacing and things, I was thinking of using the following descriptors:
number of peaks > mean + 1 std dev
height of the highest peak (probably not meaningful data)
% time > mean + 1 std dev
% time < mean
max time between peaks
min time between peaks
But they seem like completely ad-hoc choices. Are there more principled ways of describing a time series with what appears to be no periodic component?
Oh, what I forgot is that I can play with the window. I am now grouping events together by minute, but I can play with the window size, or do a sliding window (not sure what the point of that is in this case, other than to have a smoothing effect). Making the window size too small (10 sec) seems to make it uselessly jittery though. The peaks get broken up into seemingly meaningless rapid peak-valley-peak-valley intervals. 1 minute seems intuitively like a good trade-off between resolution and noise... but if there is a better way of analysing this, that would also help.
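For what it's worth, the ad-hoc descriptors listed above are straightforward to compute. A sketch (treating a "peak" simply as a sample above mean + 1 std dev, which is an assumption on my part; names are mine):

```python
from statistics import mean, pstdev

def describe(series):
    """Compute the peak-based descriptors discussed above for a
    discrete time series (one value per window)."""
    m, s = mean(series), pstdev(series)
    peaks = [i for i, x in enumerate(series) if x > m + s]
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]  # spacing between peaks
    return {
        "n_peaks": len(peaks),
        "highest_peak": max(series),
        "frac_above": len(peaks) / len(series),
        "frac_below_mean": sum(x < m for x in series) / len(series),
        "max_gap": max(gaps, default=None),
        "min_gap": min(gaps, default=None),
    }

d = describe([0, 0, 10, 0, 0, 0, 10, 0])
print(d["n_peaks"], d["max_gap"])  # 2 4
```

If you want "peaks" to mean contiguous runs above the threshold rather than single samples, you'd collapse consecutive indices in `peaks` first.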
Hey, seeing as no one else gave you an answer I thought I might add something. If I understand you correctly, your sample points are seemingly random, so there is no correlation with the time points; why not leave time out of it and treat them as 1-dimensional samples? A good way of presenting such samples would be a box plot, which includes the most commonly used statistical metrics.
What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?
The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.
On December 18 2017 08:17 amyamyamy wrote: Oh, what I meant is if there's an elegant way to derive a continued fraction representation (simple, or non-simple like your example) of pi.
Obviously one could approximate pi and construct a continued fraction from that approximation, but what I'm wondering is if there's another way to derive a CF for pi
There is (unlike what is suggested by the uneducated people who have told you otherwise). But to understand it you would need a deep understanding of Jacobi theta functions.
However, I can give you a cleaner way.
Remember arctan(1) = π/4, right? But also arctan(x) = x - x³/3 + x⁵/5 - x⁷/7 + x⁹/9 - ..., so you can write:
π = 4 × (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + 1/13 - ...)
Many other series of similar type abound. The best are in base 8 and base 16, though. The best one is called the BBP digit-extraction algorithm because it lets you compute any single digit without first computing all the ones before it (think: base 16 is denser than base 10). It reads:
π = Σ_{k=0}^∞ (1/16^k) · [4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6)]
Note that these aren't "magic numbers". They come from polylogarithm identities in base 16.
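The plain sum form of the BBP series is easy to check numerically. (The actual digit-extraction trick needs modular exponentiation on top of this; the sketch below is just the convergence check.)

```python
from math import pi

def bbp(terms):
    """Partial sum of the BBP series for pi quoted above."""
    return sum(
        (1 / 16 ** k)
        * (4 / (8 * k + 1) - 2 / (8 * k + 4) - 1 / (8 * k + 5) - 1 / (8 * k + 6))
        for k in range(terms)
    )

print(abs(bbp(10) - pi))  # already tiny after ten terms
```

Each term shrinks by a factor of 16, so roughly 1.2 decimal digits of accuracy are gained per term.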
On February 23 2018 23:34 Joni_ wrote: Random pure maths question:
What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?
The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.
Look into what happens to the fixed orbit of elements acting on rational points under the Frobenius automorphism.
On February 23 2018 23:34 Joni_ wrote: Random pure maths question:
What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?
The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.
Look into what happens to the fixed orbit of elements acting on rational points under the Frobenius automorphism.
Care to elaborate? I only know the Frobenius automorphism as an automorphism of fields of prime characteristic.
Tbh I'm not particularly knowledgeable in a lot of advanced C*-topics like K-Theory in general; it's mostly spectral theory/Gelfand theory for me, so it might be that I simply lack the required conceptual background. :>
On February 23 2018 23:34 Joni_ wrote: Random pure maths question:
What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?
The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.
Look into what happens to the fixed orbit of elements acting on rational points under the Frobenius automorphism.
Care to elaborate? I only know the Frobenius automorphism as an automorphism of fields of prime characteristic.
Tbh I'm not particularly knowledgeable in a lot of advanced C*-topics like K-Theory in general; it's mostly spectral theory/Gelfand theory for me, so it might be that I simply lack the required conceptual background. :>
What I meant is you need to formulate an injective but not surjective map (or measure) from your Banach space to an appropriate field with a more restrictive topology. Study what happens in that case; it will illuminate the full problem. This is a common strategy. Look at Tykhonov's theorem.
(after an hour of thinking and editing this post: ) Some thoughts: Cauchy's integral theorem tells us for which kind of domain a holomorphic function will definitely have an antiderivative. When you just look at the fundamental theorem, it's unclear if f(z)=1/z could ever have an antiderivative, even though it's holomorphic. But with Cauchy it's clear that it will have an antiderivative on a simply connected domain.
PS: I hope I'm making enough sense here. My "math english" is terrible :/
Your additional thoughts are exactly on point. The fundamental theorem talks about functions which have a holomorphic antiderivative as a base, which is not the same as holomorphic functions.
Cauchy's integral theorem, on the other hand, makes a statement about all holomorphic functions. It is thus stronger in that regard.
Meanwhile, as you noticed, once you know a holomorphic antiderivative exists, the fundamental theorem becomes stronger.
On February 23 2018 23:34 Joni_ wrote: Random pure maths question:
What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?
The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.
Look into what happens to the fixed orbit of elements acting on rational points under the Frobenius automorphism.
Care to elaborate? I only know the Frobenius automorphism as an automorphism of fields of prime characteristic.
Tbh I'm not particularly knowledgeable in a lot of advanced C*-topics like K-Theory in general; it's mostly spectral theory/Gelfand theory for me, so it might be that I simply lack the required conceptual background. :>
What I meant is you need to formulate an injective but not surjective map (or measure) from your Banach space to an appropriate field with a more restrictive topology. Study what happens in that case; it will illuminate the full problem. This is a common strategy. Look at Tykhonov's theorem.
Thanks for the reply, I didn't even notice until today. Yet there seems to be some communication mishap (I guess I just don't quite get what you mean):
There will never be an injective map that is also an algebra homomorphism from a general Banach Algebra into any field, as a Banach Algebra is not in general an integral domain, while a field is.
A map that is not an algebra homomorphism will not allow me to deduce any algebraic properties of the elements of an algebra (like idempotence) from their images, so I don't see how that could help.
As far as I can tell, Tykhonov's Theorem seems to be completely unrelated to my case, but maybe I'm just not seeing the proper broader picture.
Hey Ciaus sorry for not replying, I got deep into my work. I will tell you I did use what you provided, but the algorithm itself did not pan out.
but anyways here I am with a new question for you math geniuses
I have written an algorithm that works with graphs. The long and short of it is, I arrive at a point where I have k sets, each has unique values within them. The sets vary in size.
So maybe I have 8 sets: three are size 1, two are size 2, two are size 4, and one of them is size 9.
I have to separate these k sets into n sets, where n is a value that is fixed ahead of time. Let's say n is 18.
Well, using basic math I can see that I have 3+4+8+9 = 24 values to work with. 24/18 = 1.3333.. average values per set, in order to form n sets.
Now here is where things get tough. The k sets are formed as such because they are disjoint, and the values must remain disjoint. This means that each set of size one, must remain size one. In a similar fashion, any of the other sets cannot be "mixed" with values from another set. So, sets of size two could never form a new set of size 3. Nor could a value from the set of size 9 be put into a set with a value from a set of size 4 to form a set of size 2. Essentially, values can only form a new set with other values they were originally matched with.
Using this information, is it possible to actually solve what the sizes of my n disjoint sets must be?
DISCLAIMER: The above example is not a real example and likely does not have an actual solution. If needed, I could run my algorithm to get a real world example.
If I understand the problem correctly then there are obviously only solutions for n greater than or equal to k. Can't you just sort the k sets in ascending order and pick single elements to form new size-1 sets until you have as many sets as you want? Now calculating the sizes of the new sets shouldn't be a problem. (You have all untouched old sets, some new size-1 sets, and maybe 1 half-used old set.)
@travis: awesome, glad it was of some help, sorry things didn't work =/
For the new problem, is there any requirement other than simply the number of new sets you must have? If not, then @Kleinmuuhg's answer looks perfect. If you want to keep sets small-ish, you can just subdivide the largest sets first, until you have n in total.
Also while I'm here, here's a fun thought-experiment that uses a bit of basic graph theory and probability intuition:
Say you are running some NGO which delivers vaccines to countries suffering epidemics. You only have enough vaccines to vaccinate some small portion of the population (say less than 5%), and you can't gather any data on the population / distribution / individuals in the country, all you can do is randomly give people the sealed vaccine, and some instructions. You may assume the people you give the vaccine to will follow your instructions.
What instructions do you give to try to most effectively stem the epidemic?
With varying extra details and depth, the very basic idea is as follows: You instruct the person you give the vaccine to not to take it themselves, but rather to give it to a friend, who should then take it. Those with more friends are hence more likely to get vaccinated. The assumption being that a person's capacity to spread the infection is proportional to how much they interact with others in their surroundings. In essence, the strategy targets those with greatest ability to spread infection.
Explicitly modeled, your population is a graph. People = vertices, friendships = edges. More incident edges, more likelihood of receiving vaccine from a friend as well as higher expected amount of infection spreading.
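The effect driving this (a random friend of a random person has more friends, on average, than a random person) can be checked with a quick simulation, even on a plain Erdős–Rényi random graph where degrees don't vary much. All parameter choices below are my own:

```python
import random

def random_vs_friend_degree(n=1000, p=0.01, trials=20000, seed=0):
    """Compare the average degree of a random person with the average
    degree of a random friend of a random person."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for u in range(n):               # build a G(n, p) random graph
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    deg_self = deg_friend = 0.0
    count = 0
    for _ in range(trials):
        u = rng.randrange(n)
        if not adj[u]:               # an isolated person has no one to give to
            continue
        deg_self += len(adj[u])
        deg_friend += len(adj[rng.choice(adj[u])])
        count += 1
    return deg_self / count, deg_friend / count

mine, friends = random_vs_friend_degree()
print(mine < friends)  # the friend-of-a-friend average comes out higher
```

On graphs with heavier-tailed degree distributions (which social networks tend to be) the gap is far larger, which is exactly why the give-it-to-a-friend instruction works.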
On May 17 2018 04:57 Kleinmuuhg wrote: If I understand the problem correctly then there are obviously only solutions for n greater than or equal to k. Can't you just sort the k sets in ascending order and pick single elements to form new size-1 sets until you have as many sets as you want? Now calculating the sizes of the new sets shouldn't be a problem. (You have all untouched old sets, some new size-1 sets, and maybe 1 half-used old set.)
On May 17 2018 05:28 Ciaus_Dronu wrote: For the new problem, is there any requirement other than simply the number of new sets you must have? If not, then @Kleinmuuhg's answer looks perfect. If you want to keep sets small-ish, you can just subdivide the largest sets first, until you have n in total.
yes, n will always be >= k
and I failed to mention the stipulation that every element in all of our sets must be used (and every element is unique, you will not see the same value twice).
in terms of the actual problem, I am not looking for what values would go in what set. that actually has an entirely new set of constraints. so, what I am hoping for is "potential solutions" in terms of the amount of sets and the sizes of them. but I think for the answer you are giving this doesn't matter.
anyways let me see if my understanding of the suggested solution is correct:
we have M elements in K sets, trying to form N sets of size Y
(at each step, we can stop and use this solution and/or backtrack to find another solution)
1.) form N sets of size 1, first using the sets of size 1, and then using at least one value from each of the other sets
2.) kind of like... fill in these sets with the remaining values? potentially a huge amount of ways they could be put in... is it the case that every time we do #2 it will give us a potential solution?
On May 17 2018 05:40 Ciaus_Dronu wrote: Also while I'm here, here's a fun thought-experiment that uses a bit of basic graph theory and probability intuition:
Say you are running some NGO which delivers vaccines to countries suffering epidemics. You only have enough vaccines to vaccinate some small portion of the population (say less than 5%), and you can't gather any data on the population / distribution / individuals in the country, all you can do is randomly give people the sealed vaccine, and some instructions. You may assume the people you give the vaccine to will follow your instructions.
What instructions do you give to try most effectively stem the epidemic?
With varying extra details and depth, the very basic idea is as follows: You instruct the person you give the vaccine to not to take it themselves, but rather to give it to a friend, who should then take it. Those with more friends are hence more likely to get vaccinated. The assumption being that a person's capacity to spread the infection is proportional to how much they interact with others in their surroundings. In essence, the strategy targets those with greatest ability to spread infection.
Explicitly modeled, your population is a graph. People = vertices, friendships = edges. More incident edges, more likelihood of receiving vaccine from a friend as well as higher expected amount of infection spreading.
instructions: kill anyone who is infected
ok no but actually I would have never gotten the recommended solution, which is actually pretty damn clever
On May 17 2018 04:57 Kleinmuuhg wrote: If I understand the problem correctly then there are obviously only solutions for n greater than or equal to k. Can't you just sort the k sets in ascending order and pick single elements to form new size-1 sets until you have as many sets as you want? Now calculating the sizes of the new sets shouldn't be a problem. (You have all untouched old sets, some new size-1 sets, and maybe 1 half-used old set.)
On May 17 2018 05:28 Ciaus_Dronu wrote: For the new problem, is there any requirement other than simply the number of new sets you must have? If not, then @Kleinmuuhg's answer looks perfect. If you want to keep sets small-ish, you can just subdivide the largest sets first, until you have n in total.
yes, n will always be >= k
and I failed to mention the stipulation that every element in all of our sets must be used (and every element is unique, you will not see the same value twice).
in terms of the actual problem, I am not looking for what values would go in what set. that actually has an entirely new set of constraints. so, what I am hoping for is "potential solutions" in terms of the amount of sets and the sizes of them. but I think for the answer you are giving this doesn't matter.
anyways let me see if my understanding of the suggested solution is correct:
we have M elements in K sets, trying to form N sets of size Y
(at each step, we can stop and use this solution and/or backtrack to find another solution)
1.) form N sets of size 1, first using the sets of size 1, and then using at least one value from each of the other sets
2.) kind of like... fill in these sets with the remaining values? potentially a huge amount of ways they could be put in... is it the case that every time we do #2 it will give us a potential solution?
Hmm, there are actually a huge number of ways you may be able to end up doing this. If n >> k, and some of your original k sets are themselves large, you end up with something worse than set-partitions in terms of how many ways you can break it up.
But let's be explicit. Let m be the total number of (unique) numbers in our k sets. Let n be the target number of sets.
Constraints: There exists at least one way of getting our m numbers into n new sets if and only if k ≤ n and n ≤ m.
Only if: The first inequality (k ≤ n) holds because you can never merge old sets, only break them up or keep them the same. The second inequality (n ≤ m) holds because every set needs at least 1 number in it, so you can't have fewer total numbers than sets.
If: If both inequalities hold and n = k, then the sets as they are given are fine. If k < n, choose any set with more than 1 element (which must exist as k < m) and break it into 2 sets. We've now increased "k" by 1. Keep doing this until "k" = n. (I say "k" as we've essentially done induction to get a new value for the number of sets we already have.)
Horror of counting all possible solutions: The issue is that there are a huge number of ways to break up your k old sets into n new ones (every new set will just be some subset of an old one). Finding even a rough count is nightmarish at best. Let's give an example to illustrate. Say k = 3, n = 5, m = 15. Let our 3 starting sets be A, B, C with |A| = |B| = |C| = 5.
To get to 5 sets from these 3, we could partition C into 3 subsets, and now we'd have 5 new sets, done. Call the number of ways of doing this {5, 3}. I think it's called the Stirling number of the second kind, and it's generally large. We could instead, however, partition just A or just B into 3. But... We could rather partition A and B into 2 and leave C as is, that'd also give us 5 new sets. Same for all pairings. As you can see this gets... well I'd never tackle this as a research problem unless I wanted to be only speaking in generating functions by the end of it.
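The {5, 3} here is indeed the Stirling number of the second kind, S(5, 3) = 25, and it is easy to compute via the standard recurrence (a sketch):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """S(n, k): partitions of an n-element set into k non-empty,
    unlabeled blocks, via S(n,k) = S(n-1,k-1) + k*S(n-1,k)."""
    if n == k:
        return 1          # includes S(0, 0) = 1
    if k <= 0 or k > n:
        return 0
    # element n is either a block on its own, or joins one of the k blocks
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

print(stirling2(5, 3))  # 25
```

The recurrence mirrors the counting argument: the last element either forms its own block or joins any of the k existing blocks.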
Easy way to get any solution: All hope is not lost though. If ALL you care about is finding some way of turning your k sets into n sets, with those same elements, by only breaking sets up and never merging them, then this is easy. First, check that the inequalities hold (otherwise no solution exists). Second, check if k = n; if so, we are done. If not, take the largest set you have. Cut off 1 single element and make a new set with just that element. You now have k+1 sets. Find the new largest set (it might not be the same as in the previous step) and again, take off 1 element and make that element its own set. Keep doing this until you have n sets.
In our previous example with A, B, C, the algorithm would take 1 element from, say, A, at step 1. Then another element from say B (NOT A, as it is smaller than B after we chopped an element off of it) at step 2. Then you'd be left with C, two sets of size 4, and two sets of size 1.
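The peel-one-element procedure above is only a few lines of code (a sketch; the names are mine):

```python
def split_into(sets, n):
    """Turn k disjoint sets into n disjoint sets by only breaking sets up
    (never merging), per the greedy scheme above: repeatedly peel one
    element off the current largest set."""
    parts = [list(s) for s in sets]
    if not len(parts) <= n <= sum(len(p) for p in parts):
        raise ValueError("no solution: need k <= n <= m")
    while len(parts) < n:
        biggest = max(parts, key=len)
        parts.append([biggest.pop()])
    return parts

# the A, B, C example above: three sets of size 5, target n = 5
result = split_into([{1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}, {11, 12, 13, 14, 15}], 5)
print(sorted(len(p) for p in result))  # [1, 1, 4, 4, 5]
```

As described in the thread, this leaves one untouched set of 5, two sets of size 4, and two singletons.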
That last part of Ciaus_Dronu's answer is basically what I suggested. I think he explained it pretty well. If needed, I can type up some pseudocode if you want.
Agreed. I don't actually think the problem is that hard at all. You just need a principled way to run through your sets. Given that you have some kind of condition for what elements can be inside a set, which you haven't specified, you can simply iterate through this generation process, and stop when you find an acceptable solution. Greedy algorithms work!
Of course, if you can somehow use your acceptability criterion to restrict the search space, then you should. But if you can't and can only decide whether a solution is acceptable or not, then just iterate greedily until you find an acceptable partition.
Oh, of course, if you have a very tiny number of actually acceptable solutions, or you are looking not just for any solution, but for the best solution, then this is not going to work as well. But for just finding *a* solution, greedy is fantastic.
If your additional constraints involve a specific ordering of the elements within the sets and how they can be divided according to that, I thought of an easy way to both count and generate possible partitions. Basically, if you can "linearise" the problem a bit and get rid of partitions that don't look like intervals when you give the elements some order, it's not too hard to count all possible partitions, and to see how you get them.
If this is of value to you, I can type it up. But it's a very specific constraint so maybe not =/
The gist is to model the covering relation of the order inside each set as edges between values (i.e. an edge between u and v if v is the smallest value bigger than u). Then you just delete any n-k edges and get a graph with n components, corresponding exactly to an allowed partition into n sets.
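Under that linearised constraint the count is pleasant: with m elements in k ordered sets there are m − k consecutive-pair edges, and each choice of n − k edges to delete gives a distinct partition into n intervals, so there are C(m−k, n−k) of them. A sketch (names are mine):

```python
from itertools import combinations
from math import comb

def interval_partitions(sets, n):
    """Enumerate partitions obtained by cutting n - k of the m - k
    edges between consecutive elements of the given ordered sets."""
    k = len(sets)
    # each edge is (set index, position after which we may cut)
    edges = [(si, pos) for si, s in enumerate(sets) for pos in range(1, len(s))]
    for cuts in combinations(edges, n - k):
        parts = []
        for si, s in enumerate(sets):
            bounds = [0] + sorted(p for sj, p in cuts if sj == si) + [len(s)]
            parts.extend(s[a:b] for a, b in zip(bounds, bounds[1:]))
        yield parts

sets = [[1, 2, 3], [4, 5]]
found = list(interval_partitions(sets, 3))
print(len(found), comb(5 - 2, 3 - 2))  # 3 3
```

So the enumeration and the binomial count agree, which is the one-to-one correspondence the edge-deletion picture gives you.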
I'm having an issue (dispute) with someone and need some clarity, as my brain is fuzzing whenever I try to properly prove it one way or the other.
The core issue is this: trying to calculate the expected cost of something, in the context of a game.
basically; there's a yearly chance that the event might happen if you pay a fee. say X% a year. and each year, you have to pay K. i.e. each year you pay K, and you get an X% chance of the event.
now there's a modifier N, which can sometimes affect this, and i'm trying to figure out how the expected cost varies as a function of N, roughly. it doesn't have to be exactly right, as the dispute is simply about the scaling rate.
N typically has values near 1, with the baseline case being where N =1; it always has positive value I think. all values are between 0.8 and 1.5
N has the following effects: it changes the yearly cost to K*N, and it modifies the yearly chance of the event happening; I think to X*(1/N), though it might be X*(2-N), which unfortunately is a whole different calculation, though they're fairly similar over the range of values actually used.
You are not being 100% clear. Are the following true?
If you pay, the event might happen, otherwise it doesn't. The event is positive, and you wish to know the average cost of the event, without caring how long it takes for the event to happen? So an example would be a lottery, where you pay a monthly fee for a chance of winning a cruise.
For the base case, the expected cost of the event is K/p, where K is your cost and p is the probability of the event happening (The percentage X you declared, divided by 100). The reason for that is that you expect on average p events happening per year that you pay (It doesn't matter that p is smaller than 1). So you pay K for p events, thus the cost of one event is K divided by p.
The same logic still works in your modified case. Through a change of N, the cost changes to K*N, and the number of events per year changes to p/N. Thus, you pay K*N for p/N events, and each event costs you K*N divided by p/N. Fractional maths means that this is equal to K*N*N/p. So in this case, the total cost is K*N² / p
In the second case, a similar argumentation leads to K*N / (p*(2-N))
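If it helps settle the dispute numerically, here is a minimal Monte Carlo sketch (a hypothetical simulate_expected_cost helper, assuming the first modifier model where the yearly chance becomes p/N) that can be checked against the analytic K*N²/p:

```python
import random

def simulate_expected_cost(K, p, N, trials=100_000):
    """Average total cost paid until the event occurs: each year costs
    K*N, and the event happens that year with probability p/N."""
    total = 0.0
    for _ in range(trials):
        cost = 0.0
        while True:
            cost += K * N
            if random.random() < p / N:
                break
        total += cost
    return total / trials

# Analytic prediction from the argument above: K * N**2 / p
```

For example, with K=10, p=0.5, N=1.2 the simulation should hover around 10 * 1.44 / 0.5 = 28.8.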
On May 25 2018 09:06 Simberto wrote: You are not being 100% clear. Are the following true?
ok i'll try to clarify, though it looks like you got it all anyways.
obviously we have to specify that N <= 2 for the second case, because probabilities cannot be negative.
moreover, that is right for the base case, where you know whether N is applied or not. To compute the general case, where you cannot say a priori whether N is active, you need to know the probability that the modifier N is applied. The general case will also allow you to compute the probability of the event occurring in at most Y years, and things like that.
I've got another probability question. A bit similar to the last one I asked, but I'm still stumped.
The goal is to get to 0. If you are at some other number, you roll to see what number you go to next. Either you stay at your current number or you move closer to 0. You continue rolling until you get to 0.
See the chart below. The first number is the number you are at, and the second number is the number you can potentially go to, followed by the chance of getting there. So for example, if you are at 1 you have a 50% chance of rolling to 0 and stopping after 1 roll, and a 50% chance of staying at 1 and trying again.
The question is: What is the expected number of rolls to get to 0 for each number? I already have the answers by simulation: 1 = 2, 2 = 2.667, 3 = 3.144. But I'm stumped as to how to actually calculate that.
Calculating at 1 is easy. Without even having to do any calculations, you can easily see that if you have a 50% chance of winning each roll, it will take on average 2 rolls.
Calculating 2 is trickier. You have a 25% chance of rolling 0, therefore ending in 1 roll. You have a 50% chance of rolling to 1. Since I already know the expected value at 1 is 2 rolls, that means at 2 you have a 50% chance of winning after 3 rolls. But what is the value if you roll from 2 to 2? There's a convergence there that I'm not sure how to calculate. Any help would be appreciated. Thanks!
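One standard way past that "convergence": E[2] appears on both sides of its own equation (E[2] = 1 + 0.25·0 + 0.5·E[1] + 0.25·E[2]), so you can just solve for it algebraically. A Python sketch, assuming (as clarified later in the thread) that each of the n zero bits independently becomes a 1 with probability 1/2 on every roll:

```python
from fractions import Fraction
from math import comb

def expected_rolls(max_zeros=3):
    """Expected number of rolls to reach 0 zeros.

    Assumes each of the n zero bits independently becomes a 1 with
    probability 1/2 on every roll, so P(n zeros -> m zeros) is
    C(n, n-m) / 2^n.

    The self-loop term E[n] appears on both sides of its own equation,
    so isolate it: E[n] = (1 + sum_{m<n} p(n,m)*E[m]) / (1 - p(n,n)).
    """
    E = {0: Fraction(0)}
    for n in range(1, max_zeros + 1):
        p = {m: Fraction(comb(n, n - m), 2 ** n) for m in range(n + 1)}
        E[n] = (1 + sum(p[m] * E[m] for m in range(n))) / (1 - p[n])
    return E
```

Solving gives E[1] = 2, E[2] = 8/3 ≈ 2.667 and E[3] = 22/7 ≈ 3.143, matching the simulated values up to noise.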
Edit: Okay Im not typing this, sorry. Hope you can read my scribbling Not sure why our results are different for 3. If you have any questions, please ask and I can elaborate more if needed.
Sounds like a typical get-a-hypothesis-by-juggling-with-numbers-into-proof-by-induction exercise: Basically: E[W_n]= 1/(n+1)* (E[W_n]+...+E[W_1]+E[W_0]), which means you can calculate E[W_n] if you know all "lower" expected values.
However, now that I read your numbers again, it seems that not every number has the same probability? You should definitely explain what the probability of rolling x is when you can roll numbers up to n. Do you mean to say you can only roll 0 or 1 and then subtract the result from n? In any case, you should be able to modify the recursive formula to get a proof by induction/recursion.
Edit: Ok my formula is wrong, take the one from the screenshot in the post above. The idea should still stand. Edit: Also, 22/7 would round to 3.142 or 3.143 but never 3.144. Is that just a typo?
I'm looking at a smaller example, a 4-bit number instead of 32-bit. Take a number from 0 to 2^4-1 and convert it to binary. If in binary a number has one 0 and three 1's, what is the probability that a bit-wise OR against another 4-bit number will yield a number with all four 1's (15)? It's 50%. The chart I provided earlier is the probability of going from a 4-bit number with x 0's to a 4-bit number with y 0's. Since it's a bitwise OR operation, you will never lose 1's.
I'm sure there are better approaches to the problem, but this is the direction I'm trying to solve it from. In fact, it's looking like I should probably try a different approach, as it's quite complicated to determine those probabilities.
Edit: Also, 22/7 would round to 3,142 or 3,143 but never 3,144. Is that just a typo?
It was just a guess from the results of my simulation. You're correct mathematically, but my simulated results would vary further than just a rounding error.
Aren't the bits completely unrelated in that situation? Meaning you can take the problem as 32 copies of the 1 bit problem and then recombine them later on?
Possible idea for an approach, not completely thought through:
A single bit has a probability of P0(n) = 1/2^(n+1) to NOT be a 1 after n operations. And P1(n) is thus 1-(1/2^(n+1))
32 such bits thus give a probability of 1-(P1(n))^32 that at least one of them is not a one. Maybe it is possible to somehow get an expected value out of this.
Though I am slightly confused by the "rounded to 10 digits after the decimal point" thing. That implies some sort of numerical or simulation-based approach in my opinion, because why would you want 10 digits if either the result could be described as an exact number in some way (fraction, roots or whatever) or the point is just a mathematical approach?
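Following up on that sketch: since E[T] = Σ_{n≥0} P(T > n) for a nonnegative integer T, the expected number of operations drops straight out of P1(n). A rough Python version (the tail shrinks geometrically, so truncating the sum is fine):

```python
def expected_ops(bits=32, terms=200):
    """Expected number of OR operations until all bits are 1, assuming
    each bit independently starts as 1 with probability 1/2 and each
    operation sets a remaining 0 with probability 1/2, so
    P(a given bit is still 0 after n ops) = 1 / 2**(n + 1).

    Uses E[T] = sum over n >= 0 of P(T > n), with
    P(T > n) = 1 - (1 - 0.5**(n + 1))**bits.
    """
    return sum(1 - (1 - 0.5 ** (n + 1)) ** bits for n in range(terms))

expected_ops(bits=32)
```

As a sanity check, a single bit gives exactly 1: half the time it starts as a 1 (zero ops), otherwise it takes 2 more ops on average.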
On July 17 2018 06:57 Simberto wrote: Aren't the bits completely unrelated in that situation? Meaning you can take the problem as 32 copies of the 1 bit problem and then recombine them later on?
yes, at first sight it looks like you could model this problem with i.i.d. random variables, then regard the 1-dimensional case and multiply your way up from there
Yeah, I'm pretty sure there are better ways to solve it. This is just kinda how I started it, so I wanted to see how far I could get. You guys helped me out, I got exactly what I needed. Thanks!
2.) given a binary string of length 15, 4 of the positions are set to 1 (at random). the rest are zero. how many strings can be made that have no ones next to each other?
-wtf, I have no idea how to do this :/
3.) similar to #2. We have 4 As, 3 Bs, 2 Cs, 1 D, 1 E. We use all the letters to form a word at random (all letters must be used). How many words contain no Bs next to each other?
i also have no idea how to do this one, it seems very similar.
I know how to do #1 and maybe #2. #3 I thought I had but some smaller examples have convinced me otherwise. I suspect I will be able to get it more easily in the morning - a bit rusty at this.
For #1 I think you're supposed to assume that all similar shapes are the same, so choosing circle #1 results in the same outcome as if you chose circle #2, for example. In that case, you should set up some kind of linear equation that satisfies the condition (picking exactly 4 shapes from 4 types of shapes), include restrictions on the number of circles and triangles you have, and figure out the number of solutions to that equation.
For #2, I think I solved it correctly by assuming that every 1 is either the first digit in the string or preceded by a 0. Therefore, you can break up the problem into 2 cases — one where the first digit is 1, and one where the first digit is 0 — and figure out the number of ways in each case to rearrange the appropriate number of 0s and "01"s. For example, in the first case, the first digit must be 1, after which point you have three "01"s and 8 other 0s.
On September 07 2018 13:00 DarkPlasmaBall wrote: For #1, why is it 16 and 15 instead of 17s?
This. Also a bit rusty, but shouldn't you multiply the second line by 4, and the third line by 6, because the order in which that happens doesn't matter?
For #2, seems like you can solve that with dynamic programming. The chance of having 2 1s next to one another is:
Let c1 be the chance the first digit is 1, and c2 be the chance the 2nd digit is 1, given that the first digit is 1.
Concerning #1: I am guessing we do not distinguish the circles, etc. from one another? That'd make for a whole lot more possibilities.
#2: the way these problems are usually handled is that you imagine the 11 zeros side by side and the 4 ones as 'partitioners' which you can place at any of the 12 positions in between the zeros or to either side. Think of it as 11 blue cars parked in a row. You now want to park your 4 red cars. The first one you can put in between two blues (10 spaces), or in front of all, or in the back of all, which makes a total of 12 possibilities. For the next car you have one less possibility, etc. If all the red cars were different you would have 12*11*10*9 possibilities. But you cannot distinguish between two red cars, so you have to account for that. This leads to 12 over 4 as your end result. (Drawing without putting back, without caring for order: binomial coefficient.)
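A quick sanity check of the car-parking count (ordered placements of distinguishable cars, then dividing out the orderings of the identical red cars):

```python
from math import comb, factorial

ordered = 12 * 11 * 10 * 9            # red cars treated as distinguishable
identical = ordered // factorial(4)   # red cars are actually identical
assert identical == comb(12, 4) == 495
```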
I wrote it in Python. For my dynamic programming solution, I get:
1 - 0.6373626373626374 = ~36%
Using Kleinmuuhg's method, the probability is (12 over 4)/12⁴ = 0.02387152777
What am I doing wrong?
E: I also just saw that the question is not about the probabilities, but the actual combinations. Either way, there is something wrong somewhere :D
E2: checking it for strings of length 3 and 2 1s, I think it's clear that both kleinmuuhg and my method work, but I fucked up on the total number of possiblities. It isn't 12⁴, it's 15 choose 4. And then it all works.
A simple recursive Python solution for #2 that seems to produce reasonable results:
def get_count(zeros, ones):
    if ones == 0:  # Nothing, or only 0s left
        return 1
    if ones == 1:  # A single 1 can be placed in multiple positions
        return zeros + 1
    if zeros == 0:  # All other combinations require 0s to work
        return 0
    # The remaining string starts with either 0 or 10
    return get_count(zeros - 1, ones) + get_count(zeros - 1, ones - 1)

get_count(11, 4)
Not sure it's correct, but this gives a result of 495.
This can be solved by means of simple dynamic programming. Basically, the number of ways of placing x 1's in a binary string of length y is the sum, over all possible positions of the last 1, of the number of ways of placing x - 1 1's in the prefix that ends two positions earlier. Here's my c# code:
private int Solve(int maxSize, int maxCount)
{
    int[,] data = new int[maxSize + 1, maxCount + 1];
    for (int spot = 1; spot <= maxSize; spot++)
    {
        data[spot, 1] = 1 + data[spot - 1, 1];
        for (int count = 2; count <= maxCount; count++)
        {
            for (int position = spot; position - 2 >= 1; position--)
            {
                data[spot, count] += data[position - 2, count - 1];
            }
        }
    }
    return data[maxSize, maxCount];
}
#3 can't be trivially reduced to #2 since "abc" and "cba" are still distinct words even if the only "b" is always second.
The solution should be pretty straightforward with combinatorics. Anyhow, in Python (combined with the solution for #2):
def get_word_count(letter_counts):
    if sum(letter_counts.values()) == 1:
        return 1  # Only one letter left

    def sub_count(letter):
        if letter_counts.get(letter) == 0:
            return 0  # Invalid letter to continue with
        else:
            return get_word_count({k: (v - 1 if k == letter else v)
                                   for k, v in letter_counts.items()})

    return sum([sub_count(letter) for letter in letter_counts.keys()])

get_count(8, 3)  # Distinct positions of b
get_word_count({'a': 4, 'c': 2, 'd': 1, 'e': 1})  # Different words "around" b:s
On September 08 2018 03:43 BByte wrote: #3 can't be trivially reduced to #2 since "abc" and "cba" are still distinct words even if the only "b" is always second.
You are correct! I was a little too excited and glossed over that fact.
Curious though, do we know if the problem is asking for how many distinct words? Since my solution finds all the distinct ways of arranging B's so they're never next to each other, then it's just a couple quick calculations based on whether or not the problem is asking for only distinct solutions. Your number is likely right since it's divisible by 84 :D
and a hint at the solution for #3 would be: (possibilities of mixing A, C, D, Es) * (possibilities of distributing 3 Bs in between 8 other letters), where the latter part works exactly the same as in #2
On July 17 2018 04:24 Kleinmuuhg wrote: Edit: Okay Im not typing this, sorry. Hope you can read my scribbling Not sure why our results are different for 3. If you have any questions, please ask and I can elaborate more if needed.
Oh btw, I wanted to say thanks again for this. It helped me solve a few more problems so I think I finally have a good understanding of it.
Alright, it's been a bit but im back. I imagine I did quite poorly on the homework since I did not know how to do these.
Had to use purely math by the way. No computers lol
for #1, I think all of you were right about your criticisms. for #2, I couldn't use dynamic programming! but klein, your way of thinking about it is brilliant. It's kind of how I approached it, but I ended up doing it... a little wrong. I didn't conceptualize it as clearly as you did.
for #3, does anyone have a mathematical explanation? I did it wrong anyways; I saw Acrofales' post and have a lot of faith in him, so I approached it like that even though my brain was telling me the problem was different. I was low on time, lol. I am a bit behind in my classes, so I am going to have to stop focusing on side projects or I'm gonna get creamed in midterms.
On September 09 2018 11:26 travis wrote: Alright, it's been a bit but im back. I imagine I did quite poorly on the homework since I did not know how to do these.
For #3 I'd definitely recommend looking at Klein's hint above. Basically, the recommendation is to count the number of possible words by saying: "When generating a word, I will first choose the positions of the Bs, and then I will choose the order of the remaining letters and fill them into the open spaces". The important things you have to ask when doing a constructive argument like this are:
- Am I not overgenerating? That is, can I generate the same word twice in two different ways and thereby end up counting it twice?
- Am I not undergenerating? That is, are there some valid words that my method won't generate, and thus won't be counted?
- In this case, where there are multiple steps, are the steps independent? If not, you can't multiply them together to get the final count.
If you can convince yourself of these three things, then it simply becomes (number of ways to place the Bs) * (orderings of the remaining letters). The first is an identical problem to #2, and the second is a more straightforward calculation.
If I'm not too rusty, it should be (8 choose 4) * (4 choose 2) * (2 choose 1)
If you want to have another crack at #3 yourself, just read raNazUra's post. If you get stuck, here's the mathematical solution I wrote up for #2 and #3.
Before jumping into the details of the problems, let me go over the "stars and bars" idea. Imagine I have 3 stars * and 2 buckets [], and I want to count how many ways I can put the stars in the buckets. Our example is tiny, so we can just draw pictures of all the possibilities:
[***] []
[**] [*]
[*] [**]
[] [***]
Now the key idea is to ignore the buckets themselves, and focus on the divisions between the buckets, which we can draw with a bar |. Using bars instead of buckets, our pictures look like this:
***|
**|*
*|**
|***
2 buckets means 1 division/bar, and we still have the 3 stars, so there are (3+1) choose 1 possibilities.
If we change the scenario to have 3 buckets, then there will be 2 divisions/bars. Now our pictures look like *|*|* or *||** or |***|, and there are (3+2) choose 2 possibilities.
In general if we have N stars and M buckets, then our pictures will have N stars and M-1 bars, and there are (N+M-1) choose (M-1) of these. (Equivalently, N+M-1 choose N.)
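If it helps, the general formula is easy to wrap up as a small helper (a hypothetical stars_and_bars function, just restating the count above):

```python
from math import comb

def stars_and_bars(stars, buckets):
    """Ways to put `stars` identical items into `buckets` distinguishable
    buckets: (stars + buckets - 1) choose (buckets - 1)."""
    return comb(stars + buckets - 1, buckets - 1)

stars_and_bars(3, 2)  # 4, matching the four pictures above
stars_and_bars(3, 3)  # 10, matching the 3-bucket scenario
```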
------------------------------------
OK, now let's use this idea for problem #2:
We know the string will have 4 ones, and then we can imagine 5 "buckets" where we can put the 11 zeroes: [ ] 1 [ ] 1 [ ] 1 [ ] 1 [ ] <--- 11x 0s need to go in
To avoid consecutive ones, it is necessary and sufficient to put a zero in each of the middle 3 buckets: [ ] 1 [0] 1 [0] 1 [0] 1 [ ] <--- 8x 0s need to go in
The remaining 0s can go into the buckets without any restrictions. We just need to count the number of ways to distribute 8 things in 5 buckets. Using our result above, there are 12 choose 4 = 495 possibilities. For example, one possibility for putting in the 8x 0s is 00||0|00|000, which produces 001010010001000 when combined with the 1s and 0s we already placed. So, there are 495 of these binary strings.
------------------------------------
On to problem #3:
First let's just worry about consecutive Bs, and ignore the distinction between A, C, D, E by saying we just have 8 other undetermined ? letters for now. So we have 3x Bs and 8x ?s, and we can apply the method from problem #2: [ ] B [?] B [?] B [ ] <--- 6x ?s need to go in
There are (6+3) choose 3 = 84 possibilities.
Now for each of the 84 possibilities, we need to turn the 8 ?s into ACDEs. For example, ??B??B??B?? could turn into AABAABCCBDE. There are 8! ways to do this ignoring duplicates, which we can account for by dividing out 4! * 2! * 1! * 1! (because of 4 As, 2 Cs, 1 D, 1 E). So we get 840 distinct words from each of the 84 possibilities.
One last thing, we need to ask ourselves if 2 different possibilities could produce the same word. For example if B?B?B?????? can turn into the same word as ??B??B??B??, then we need to worry about overcounting. But that can't happen, because the Bs are in different positions.
So the final count is just 84 * 840 = 70560 words.
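For anyone who wants to double-check the argument, here is a rough sketch (hypothetical helper names) that computes the formula and, for inputs small enough, brute-forces the same count:

```python
from itertools import permutations
from math import comb, factorial

def count_bruteforce(letters):
    """Distinct arrangements of `letters` with no two Bs adjacent
    (brute force; only feasible for short inputs)."""
    return sum(1 for w in set(permutations(letters))
               if all(not (a == 'B' == b) for a, b in zip(w, w[1:])))

def count_formula(other_counts, n_bs):
    """Place the Bs in the gaps around the other letters (the buckets
    argument above), then order the remaining letters (multinomial)."""
    n_others = sum(other_counts)
    placements = comb(n_others + 1, n_bs)   # at most one B per gap
    orderings = factorial(n_others)
    for c in other_counts:
        orderings //= factorial(c)
    return placements * orderings

count_formula([4, 2, 1, 1], 3)  # 84 * 840 = 70560
```

On a smaller instance like AABBC both agree: count_bruteforce('AABBC') and count_formula([2, 1], 2) both give 18.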
Wow, thank you, that is fantastic. I'll need to practice this, I'll probably end up coming back with a couple more combinatorics questions.
In the mean time I have another question, it's very specific and it's not actually for school. I really should be able to solve this... but I have so much going on in my brain lol and I think this should be very simple for some of you.
I have a system of equalities:
First, I will say that in this system, I know the actual values of A, B, C, D, E, F. I want to solve for G, H, I, J, K, L. I know that there *is* a solution in whole numbers. And I know the following:
A = G+H
B = I+J
C = K+L
D = G+H+I+J+K+L
E = G+I+K
F = H+J+L
So, if I know the integer values of ABCDEF, is it possible to solve for GHIJKL?
On September 11 2018 07:54 travis wrote: Wow, thank you, that is fantastic. I'll need to practice this, I'll probably end up coming back with a couple more combinatorics questions.
Clearly doesn't work for arbitrary values. So first check that D = E+F and D = A+B+C.
If that's the case, then I'm not sure there is a unique solution, but it's just linear algebra. Write it as a matrix and solve using Gaussian elimination.
I have come to the conclusion that this system does not have a unique solution.
For there to be a solution, it is necessary that D = E+F and D = A+B+C. If that is not the case, there is no solution. And if D = E + F, then equation 4 is superfluous and does not add any additional information to the set of equations 1,2,3,5,6, which means that you don't have a set of 6 equations to solve for 6 variables, which means that the system does not have a unique solution.
There seems to be another superfluous equation, since you can easily get a solution by choosing, for example, G=I=0 (other choices are also valid, and G doesn't have to be equal to I; this is just the laziest solution I came up with). The solution in that case is G=0, H=A, I=0, J=B, K=E, L=F-A-B.
So your solution has two degrees of freedom due to not having enough equations.
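The two degrees of freedom can be seen concretely from the rank of the coefficient matrix. A rough sketch (hypothetical rank helper, plain Gaussian elimination over the rationals):

```python
from fractions import Fraction

def rank(rows):
    """Matrix rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Coefficient matrix for the unknowns (G, H, I, J, K, L);
# rows are the equations for A, B, C, D, E, F.
M = [
    [1, 1, 0, 0, 0, 0],  # A = G+H
    [0, 0, 1, 1, 0, 0],  # B = I+J
    [0, 0, 0, 0, 1, 1],  # C = K+L
    [1, 1, 1, 1, 1, 1],  # D = G+H+I+J+K+L
    [1, 0, 1, 0, 1, 0],  # E = G+I+K
    [0, 1, 0, 1, 0, 1],  # F = H+J+L
]

rank(M)  # 4, so the system has 6 - 4 = 2 degrees of freedom
```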
On September 11 2018 21:02 Simberto wrote: Firstly, Acrophales is correct.
As the conditions D = E+F and D = A+B+C have to be enforced on the input, it is easier to get the 2 superfluous equations by keeping the equation for D and considering one (any) of the first 3 and one (any) of the last two equations to be superfluous.
But of course the problem is much easier to solve (as you did) by only removing the D equation and setting G and I to arbitrary values; the remaining variables then follow by simple elimination without any need for equation transformations.
You could use one set as all elements (a,b,c,d) of N^4 where a+b+c+d = 40 and another set as all elements of (a,b,c,d) of N^4 where the set of inequalities is true. Both of those should be easy to count, and if you also get the number of elements in the union of the sets somehow, you can use inclusion-exclusion to get the number of elements in the intersection.
That is the only way i see to apply inclusion-exclusion here. Maybe the number of elements in the union is easier to find than the number of elements in the intersection?
The cardinality of the whole N^4 is infinite. But the cardinality of N^4 which also satisfy a+b+c+d=40 is not.
(Basically, it is a fancy way of saying "find out how many sets of 4 integers satisfying a+b+c+d=40 exist".)
Edit: I have something else I want to write, but people are replying too quickly.
Edit2: If you want to use inclusion-exclusion, you need (at least) two sets which partially overlap. I see the two sets "4 integers with a+b+c+d = 40" and "4 integers with the other 4 inequalities", which do partially overlap. The cardinality of each of those sets should be easy to find, so if you could also find the cardinality of the union of the sets, inclusion/exclusion would deliver to you the cardinality of the intersection, which is what you were asking for originally. However, I don't see a very obvious way of finding that cardinality of the union, which might mean that this is not the best path either. I don't see another way how inclusion/exclusion would be useful for this, though.
I suggest not using N or N^4 as one of the sets, as we need to add/subtract cardinalities and those sets have, as was pointed out, infinite cardinality. Only use finite sets for the inclusion-exclusion principle.
On September 24 2018 05:31 Simberto wrote: The cardinality of the whole N^4 is infinite. But the cardinality of N^4 which also satisfy a+b+c+d=40 is not.
This set might be a good idea but it is so huge I don't feel that way.
Edit: Not a good idea, see longer explanation below.
IMO everyone is way overcomplicating this problem.
On September 24 2018 04:47 travis wrote: I have an inclusion/exclusion principle question. I have been given a formula
a+b+c+d = 40
2 <= a <= 15
0 <= b <= 20
0 <= c <= 25
1 <= d <= 25
It says to find the number of integer solutions using the exclusion/inclusion principle. I am... confused.
The answer I found is 5502, but I did not find it with exclusion/inclusion principle. I am not sure how to apply exclusion/inclusion here.
First, you should reduce the problem to a+b+c+d =37 with 0 ≤ a ≤ 13, 0 ≤ b ≤ 20, 0 ≤ c ≤ 25, and 0 ≤ d ≤ 24, which is functionally the same thing. Assuming you know how to find the number of solutions to a+b+c+d = 37 with no restrictions, you can use exclusion/inclusion to subtract the cases where a > 13, b > 20, c > 25, d > 24.
On September 24 2018 05:31 Simberto wrote: The cardinality of the whole N^4 is infinite. But the cardinality of N^4 which also satisfy a+b+c+d=40 is not.
This set might be a good idea but it is so huge I don't feel that way.
I dunno. It doesn't seem so bad. The solution can be found in a similar way to the earlier problems, by seeing it as ways of placing 3 blue cars in between 40 red cars (thereby dividing the space up into 4 sets of red cars that sum to 40). There are 43!/(40!3!) = 12341 ways of doing that.
Imho the problem is that I agree with simberto that I don't see a way of using inclusion exclusion, because I can't find an easy way of calculating the union of that set and the set with the individual constraints.
I dunno. It doesn't seem so bad. The solution can be found in a similar way to the earlier problems, by seeing it as ways of placing 3 blue cars in between 40 red cars (thereby dividing the space up into 4 sets of red cars that sum to 40). There are 43!/(40!3!) = 12341 ways of doing that.
Thanks for using my car example :D but I am afraid in this case that will not yield the correct result. Best seen by the fact that an integer can be used more than once (drawing with putting back)
On September 25 2018 01:24 The_Templar wrote: IMO everyone is way overcomplicating this problem. First, you should reduce the problem to a+b+c+d =37 with 0 ≤ a ≤ 13, 0 ≤ b ≤ 20, 0 ≤ c ≤ 25, and 0 ≤ d ≤ 24, which is functionally the same thing. Assuming you know how to find the number of solutions to a+b+c+d = 37 with no restrictions, you can use exclusion/inclusion to subtract the cases where a > 13, b > 20, c > 25, d > 24.
Reducing is an elegant idea, I like it. The second part of your answer, however, is not how inclusion-exclusion works. Without going into the math details, let me try to explain: you don't count good cases and bad cases and then subtract the bad cases. You count different combinations of sets of good cases and then subtract the number of good cases you counted more than once.
Imagine you want to count fruit you bought, but you are a bit of a dimwit, so you only know that you bought 8 apples (A), that you bought 7 green fruit (B), and that you have 3 green apples (A ∩ B). Then the total number of fruit is 8+7-3=12. You count all apples and all green fruit, but then you counted the green apples twice, hence you subtract that number. Obviously if you have more than 2 sets (which might be the case here) then you count some stuff thrice etc.; the formulas for those cases are very intuitive and can be found everywhere (Wikipedia e.g.).
What does this mean for our case here? We don't want to count cases that are not valid solutions to our equation (and subtract them later or something like that; this can work to get to a solution, but it's not inclusion-exclusion). Hence counting all solutions to a+b+c+d=40 or 37 is not the way to go.
And just after writing all of this I realize that I am dumb and tired. I'd delete it, but I will let it stand as a warning to all others. I'm going to bed, good night. Sorry ;D
I think The_Templar is on the right lines. Let
S = set of solutions for a+b+c+d=37
A = solutions with a <= 13
B = solutions with b <= 20
C = solutions with c <= 25
D = solutions with d <= 24
Then you are looking for the intersection of A, B, C, D.
The number of solutions that satisfy all of those inequalities is |S| - those which fail at least one; |AnBnCnD| = |S| - |S\A u S\B u S\C u S\D|
You can use the inclusion-exclusion principle on |S\A u S\B u S\C u S\D| because the intersections are small and often empty (for example, (S\A)n(S\C) requires both a>13 and c>25, but then a+b+c+d >= 40 > 37, which is not possible).
So mainly you need to know |S\A| etc and the condition a>=14 can be reduced to a+b+c+d=23, with a,b,c,d>=0.
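Putting that outline together as a sketch (a hypothetical count_bounded helper): iterate over which upper bounds are violated, substitute x_i -> x_i + (u_i + 1) for each violated bound, and count the remaining solutions with stars and bars. This reproduces the 5502 found earlier:

```python
from itertools import product
from math import comb

def solutions(total, n):
    """Nonnegative integer solutions to x1+...+xn = total (stars and bars)."""
    return comb(total + n - 1, n - 1) if total >= 0 else 0

def count_bounded(total, uppers):
    """Inclusion-exclusion over which upper bounds are violated:
    forcing x_i >= u_i + 1 just reduces the remaining total."""
    n = len(uppers)
    result = 0
    for mask in product([0, 1], repeat=n):
        overshoot = sum(m * (u + 1) for m, u in zip(mask, uppers))
        result += (-1) ** sum(mask) * solutions(total - overshoot, n)
    return result

count_bounded(37, [13, 20, 25, 24])  # 5502, matching the answer above
```

Most of the 16 subsets contribute 0, because violating two bounds already overshoots 37.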
On September 24 2018 05:24 Simberto wrote: 4 dimensional space with integers as elements in each dimension. An element might look like this
(1,4,291,11)
So basically, you have a set out of sets of 4 numbers. I don't know if this actually makes the thing easier to solve though.
Edit to your edit:
I was suggesting finding all a, b, c, d with a+b+c+d = 40 (or rather the amount of such things)
And the amount of abcd which satisfies the second set of inequalities
And the amount of stuff in the union of those sets.
Then you can use inclusion exclusion to find the amount of stuff in the intersection of those sets.
This is an aside, but why do you call N^4 a space? It is not a vector space (because there are no additive inverses) and it is not interesting as a topological space unless you give it a weird topology, and there is no obvious metric to make it a metric space, and even if you do give it a topology I don't see how that would in any way help. This also means that 'dimension' is not well-defined. I don't find it helpful here to talk about spaces or dimension.
On September 25 2018 02:44 Melliflue wrote: I think The_Templar is on the right lines. Let S = set of solutions for a+b+c+d=37 A= solutions with a<=13 B = solutions with b<=20 C = solutions with c<=25 D = solutions with d<=24 then you are looking for the intersection of A,B,C,D.
The number of solutions that satisfy all of those inequalities is |S| - those which fail at least one; |AnBnCnD| = |S| - |S\A u S\B u S\C u S\D|
You can use inclusion-exclusion principle on |S\A u S\B u S\C u S\D| because the intersections are small and often empty (for example, (S\A)n(S\C) requires both a>13 and c>25 but then a+b+c+d>38 which is not possible).
So mainly you need to know |S\A| etc and the condition a>=14 can be reduced to a+b+c+d=23, with a,b,c,d>=0.
On September 24 2018 05:24 Simberto wrote: 4 dimensional space with integers as elements in each dimension. An element might look like this
(1,4,291,11)
So basically, you have a set out of sets of 4 numbers. I don't know if this actually makes the thing easier to solve though.
Edit to your edit:
I was suggesting finding all a, b, c, d with a+b+c+d = 40 (or rather the amount of such things)
And the amount of abcd which satisfies the second set of inequalities
And the amount of stuff in the union of those sets.
Then you can use inclusion exclusion to find the amount of stuff in the intersection of those sets.
This is an aside, but why do you call N^4 a space? It is not a vector space (because there are no additive inverses) and it is not interesting as a topological space unless you give it a weird topology, and there is no obvious metric to make it a metric space, and even if you do give it a topology I don't see how that would in any way help. This also means that 'dimension' is not well-defined. I don't find it helpful here to talk about spaces or dimension.
Yeah, that is smart. Hadn't thought of it that way. Furthermore, you can do inclusion-exclusion for the union too:
|S\A u S\B| = |S\A| + |S\B| - |S\A n S\B| and in addition to S\A n S\B being a fairly small set, you can simply repeat Templar's initial trick in reducing the problem again here. Neat!
Oh, and as you pointed out, any intersection of more than 2 is empty, so we only need to worry about the pairwise intersections. Otherwise you'd still have quite a lot of work to do.
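Putting the whole computation together, a sketch in Python (the bounds a<=13, b<=20, c<=25, d<=24 and the total 37 are taken from the discussion above; the inclusion-exclusion runs over which upper bounds are violated):

```python
from math import comb
from itertools import combinations

def n_solutions(total, k=4):
    # nonnegative integer solutions to x1 + ... + xk = total (stars and bars)
    return comb(total + k - 1, k - 1) if total >= 0 else 0

def bounded_count(total=37, bounds=(13, 20, 25, 24)):
    # inclusion-exclusion over the set of violated upper bounds;
    # violating bound j means x_j >= bounds[j] + 1, which we substitute away
    result = 0
    for r in range(len(bounds) + 1):
        for violated in combinations(range(len(bounds)), r):
            excess = sum(bounds[j] + 1 for j in violated)
            result += (-1) ** r * n_solutions(total - excess)
    return result

print(bounded_count())  # 5502, matching a brute-force enumeration
```

As noted, only the pair (a>=14, b>=21) survives among the higher-order terms; every other combination of violations already exceeds 37.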
Pushing this thread along, one question at a time. Thank you everyone for your contributions to my education. I have a graph theory question... not sure if this is in anyone's realm of knowledge here.
I was wondering if the following is the case for a hamiltonian graph.
Let's say we have a hamiltonian graph. Somehow, we just know it is. It has N nodes, and each node has a random number of edges between 3 and N/2.
I do believe, and I think I remember having done it before, that I can systematically remove edges until every node has the same degree, simply by examining the count of how many times each node is in an edge, and removing the edges that are out of balance.
My question is - does doing this ever remove hamiltonicity of the graph, or is it always safe to do this when investigating hamiltonicity?
On September 26 2018 04:17 travis wrote: Pushing this thread along, one question at a time. Thank you everyone for your contributions to my education. I have a graph theory question... not sure if this is in anyone's realm of knowledge here.
I was wondering if the following is the case for a hamiltonian graph.
Let's say we have a hamiltonian graph. Somehow, we just know it is. It has N nodes, and each node has a random amount of edges between 3 and n/2.
I do believe, and I think I remember having done it before, that I can systematically remove edges until every node has the same degree, simply by examining the count of how many times each node is in an edge, and removing the edges that are out of balance.
My question is - does doing this ever remove hamiltonicity of the graph, or is it always safe to do this when investigating hamiltonicity?
In short: it ain't safe.
Simply assume 8 nodes, split in 2 groups of 4 nodes, all connected to each other. Then add 2 edges between the 2 groups so that the graph becomes hamiltonian. You could/would now remove the 2 connecting edges, thus leading to two isolated clusters.
Why does each node have a degree of at least 3, and not at least 2? It seems to me you can safely remove all edges that aren't on the hamiltonian cycle, leaving each node with exactly 2 edges. Thus if any node has degree > 2, there is some edge you can safely remove. But obviously knowing which ones requires you to find the hamiltonian cycle...
E: nvm. I misread. Deleted this paragraph. Rest is ok.
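The two-cluster construction is easy to check by brute force on 8 nodes. A sketch (the node labels and the choice of the two bridge edges are mine):

```python
from itertools import combinations, permutations

def is_hamiltonian(n, edges):
    # brute force: try every cycle through all n nodes, fixing node 0 first
    edge_set = {frozenset(e) for e in edges}
    for perm in permutations(range(1, n)):
        cycle = (0, *perm, 0)
        if all(frozenset(pair) in edge_set for pair in zip(cycle, cycle[1:])):
            return True
    return False

# two fully connected clusters of 4, joined by two bridge edges
cluster1 = list(combinations(range(4), 2))     # K4 on {0,1,2,3}
cluster2 = list(combinations(range(4, 8), 2))  # K4 on {4,5,6,7}
bridges = [(3, 4), (7, 0)]

print(is_hamiltonian(8, cluster1 + cluster2 + bridges))  # True
print(is_hamiltonian(8, cluster1 + cluster2))            # False
```

Here the bridge endpoints have degree 4 and everyone else degree 3, so balancing all degrees down to 3 forces you to remove exactly the two bridges, and with them the Hamiltonicity.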
On September 26 2018 06:29 travis wrote: that's true, but there's 2 edges we can remove in that graph that *are* safe, too.
so I suppose my question should be, is it impossible to guarantee the existence of edges that are safe to remove?
See Acrofales's post: if you reduce the graph down to the hamiltonian cycle, with every node having exactly 2 edges, you always have a perfect solution.
But if you want to enforce the 3+ edges condition on the new graph, it is easy to create a counterexample, e.g. here you are absolutely forced to remove the connections between the 2 clusters if you want to keep every node at 3+ edges.
edit: of course the left cluster is fully connected... blegh... forgot to put those in.
Hey guys I need your help, I used to know the answer to a math puzzle but I can't seem to figure it out anymore...
There is a room with 8 levers numbered 1-8 which are in a random (unknown) initial position. The guard is going to tell you a number between 1 and 8, then you're going to go in the room, look at the levers, move one and only one lever, and then leave the room. Then your friend goes in the room and he has to guess the number that the guard gave you. You can discuss strategy with your friend before the game starts.
What is the strategy to be 100% sure that your friend will find the right number ?
If I remember well, the answer had to do with logical gate XOR, but I can't figure it out anymore :'(
On October 18 2018 21:44 Geiko wrote: Hey guys I need your help, I used to know the answer to a math puzzle but I can't seem to figure it out anymore...
There is a room with 8 levers numbered 1-8 which are in a random (unknown) initial position. The guard is going to tell you a number between 1 and 8, then you're going to go in the room, look at the levers, move one and only one lever, and then leave the room. Then your friend goes in the room and he has to guess the number that the guard gave you. You can discuss strategy with your friend before the game starts.
What is the strategy to be 100% sure that your friend will find the right number ?
If I remember well, the answer had to do with logical gate XOR, but I can't figure it out anymore :'(
Question about this, are you *forced* to move a lever? Can you just leave the levers alone?
My current approach is to try to find a nice partition of a hypercube (totally equivalent to binary strings of a certain length differing at one digit, which is in turn totally equivalent to on/off levers of which you flip one) which might make the answer fall out. The answer will be in the form: "you can find a partition of all possible arrangements of levers into 8 classes such that you can always take a starting configuration, flip one lever, and wind up in any class of your choosing".
EDIT: nvm I think it doesn't matter if you are forced to move one.
On October 18 2018 21:44 Geiko wrote: Hey guys I need your help, I used to know the answer to a math puzzle but I can't seem to figure it out anymore...
There is a room with 8 levers numbered 1-8 which are in a random (unknown) initial position. The guard is going to tell you a number between 1 and 8, then you're going to go in the room, look at the levers, move one and only one lever, and then leave the room. Then your friend goes in the room and he has to guess the number that the guard gave you. You can discuss strategy with your friend before the game starts.
What is the strategy to be 100% sure that your friend will find the right number ?
If I remember well, the answer had to do with logical gate XOR, but I can't figure it out anymore :'(
Question about this, are you *forced* to move a lever? Can you just leave the levers alone?
My current approach is to try find a nice partition of a hypercube (totally equivalent to binary strings of certain length differing at one digit, which is in turn totally equivalent to on/off levers of which you flip one) which might make the answer fall out (the answer will be in the form of "you can find a partition of all possible arrangements of levers into 8 classes such that you can always take a starting configuration, flip one lever, and wind up in any class of your choosing).
EDIT: nvm I think it doesn't matter if you are forced to move one.
Yes you are forced, but like you said, it basically doesn't matter, because if you find the correct method there will always be a lever that won't change the class of the lever configuration. The problem is finding a smart way to partition the 8-dimensional hypercube. The answer was fairly elegant and simple, if that helps...
Number the levers 1-8 and form a number by adding up the numbers of all levers that are ON, then take it modulo 8 and add 1 to get a number from 1-8. Switching a single lever can change this number by any amount.
On October 19 2018 00:01 opisska wrote: Number the levers 1-8 and make number so that you add the lever's number if it's ON, then modulo 8 and +1 to get a number from 1-8. Switching a single lever can change this number by any amount.
Was my first guess too but it doesn't work...
Let's say you enter the room and the configuration is 1:ON 2-8:OFF and you want to make your friend guess the number 2. There's no lever that can make the sum = 2 [8].
On October 19 2018 00:01 opisska wrote: Number the levers 1-8 and make number so that you add the lever's number if it's ON, then modulo 8 and +1 to get a number from 1-8. Switching a single lever can change this number by any amount.
Was my first guess too but it doesn't work...
Let's say you enter the room and the configuration is 1:ON 2-8:OFF and you want to make your friend guess the number 2. There's no lever that can make the sum = 2 [8].
That's why it's modulo 8 and then +1, but the switches are numbered 1-8, not 0-7. Thus the sum is already 2. If you are forced to move a lever, you can always move the last one, which has no effect (8 is 0 mod 8).
edit: To elaborate, if you were able to not act at all, you would have 9 options, but you need only 8, so that one can be taken away; you just need one option that means "it's already good", and that's the last lever with value 8.
edit2: I am dumb, it would work exactly as well with switches numbered 0-7, only the first one would do nothing. Getting into the "modulo something" mindset is always hard.
On October 19 2018 00:01 opisska wrote: Number the levers 1-8 and make number so that you add the lever's number if it's ON, then modulo 8 and +1 to get a number from 1-8. Switching a single lever can change this number by any amount.
Was my first guess too but it doesn't work...
Let's say you enter the room and the configuration is 1:ON 2-8:OFF and you want to make your friend guess the number 2. There's no lever that can make the sum = 2 [8].
That's why it's modulo 8 and then +1, but the switches are numbered 1-8 not 0-7. Thus the sum is already 2. If you are forced to move a lever, you can always move the last one which has no effect (equals to 8).
To elaborate, if you were able to not act at all, you would have 9 options, but you need only 8, so you can have that taken away, but you need to basically have one option equal to "it's already good" and that's the last lever with value 8.
The problem with your method is that not all options are possible.
If the configuration is 1:OFF 2:OFF 3:OFF 4:OFF 5:ON 6:ON 7:ON 8:OFF, the sum is 5+6+7 = 18 = 2[8].
If you switch lever 1: sum=3[8]. Lever 2: sum=4[8]. Lever 3: sum=5[8]. Lever 4: sum=6[8]. Lever 5: sum=5[8]. Lever 6: sum=4[8]. Lever 7: sum=3[8]. Lever 8: sum=2[8].
You'll notice there is no way to obtain sum=1[8] or sum=7[8] or sum=0[8].
If you needed to make any of these numbers, the method doesn't work.
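The counterexample can be replayed mechanically. A sketch, using the sum-mod-8-plus-1 convention proposed above (so the unreachable residues 0, 1, 7 correspond to the guesses 1, 2, 8):

```python
def guess_value(on_levers):
    # the proposal: add the numbers of the ON levers, take mod 8, add 1
    return sum(on_levers) % 8 + 1

start = {5, 6, 7}            # the configuration above: levers 5, 6, 7 ON
reachable = set()
for lever in range(1, 9):
    flipped = start ^ {lever}        # symmetric difference flips one lever
    reachable.add(guess_value(flipped))

print(sorted(reachable))     # [3, 4, 5, 6, 7]: three guesses are unreachable
```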
On October 19 2018 00:01 opisska wrote: Number the levers 1-8 and make number so that you add the lever's number if it's ON, then modulo 8 and +1 to get a number from 1-8. Switching a single lever can change this number by any amount.
Was my first guess too but it doesn't work...
Let's say you enter the room and the configuration is 1:ON 2-8:OFF and you want to make your friend guess the number 2. There's no lever that can make the sum = 2 [8].
That's why it's modulo 8 and then +1, but the switches are numbered 1-8 not 0-7. Thus the sum is already 2. If you are forced to move a lever, you can always move the last one which has no effect (equals to 8).
To elaborate, if you were able to not act at all, you would have 9 options, but you need only 8, so you can have that taken away, but you need to basically have one option equal to "it's already good" and that's the last lever with value 8.
The problem with your method is that not all options are possible.
If the configuration is 1:OFF, 7:ON 2:OFF, 6:ON 3.OFF, 5.ON 4:OFF The sum is 5+6+7=2[8]
if you switch lever 1: sum=3[8] if you switch lever 2: sum=4[8] if you switch lever 3: sum=5[8] if you switch lever 4: sum=6[8] if you switch lever 5: sum=5[8] if you switch lever 6: sum=4[8] if you switch lever 7: sum=3[8] if you switch lever 8: sum=2[8]
you'll notice there is no way to obtain sum=1[8] or sum=7[8] or sum=0[8]
If you needed to make any of these numbers, the method doesn't work.
Okay, fuck me, I am dumb! I see the problem now, can't see how I did not see it at first. Long time since I did a puzzle like this the last time ...
So let a_1,...,a_8 denote the levers; they are either 0 or 1. x will denote xor. Notice the nice property of xor that if a or b changes, then a x b also changes. Now it is easy to tell our friend the formula for the number he needs:
first binary digit = a_1 x a_2 x a_5 x a_7, second = a_1 x a_3 x a_5 x a_6, third = a_1 x a_4 x a_6 x a_7
So now, if the levers at the beginning already give the required number, pull lever a_8 (so no digits will change). If you want to change all three digits, pull lever a_1. If you want to change only the first/second/third digit, pull a_2/a_3/a_4, and if you want to change a particular pair of digits, pull one of the levers a_5, a_6 or a_7, depending on which pair of course.
This might make your thoughts go into a good direction for finding the solution, although it is not the full solution yet. It makes finding the solution a lot easier, tho, hence the spoiler:
I love this puzzle because thinking about it for a while makes you realise that there probably is no reason that this is abstractly any more "difficult" for 8 than for 4 or 2 levers and thinking about the solution for 2 levers actually can give you the correct idea for how to solve it with more levers.
Anyway, thank you for sharing that puzzle, it's just great! I had never heard it before and it just made me miss my rehearsal.
Let's define, for i=1 to 8, P(i) = the 3-bit base-2 writing of i mod 8 (so P(8)=(0;0;0)).
for example P(5)=(1;0;1)
Then N is the base-2 writing (again mod 8) of the number your friend has to guess.
for example if you must guess 7, then N=(1;1;1)
Then define S=XOR P(i) for all levers i that are ON
for example if levers 1, 4 and 7 are ON, then S=(0;0;1) XOR (1;0;0) XOR (1;1;1) = (0;1;0)
Then define T=S XOR N and j such that P(j)=T
in our example T=(0;1;0) XOR (1;1;1) = (1;0;1) = P(5)
If you pull lever j, and ask your friend to calculate S2 (which is S once you've pulled lever j), he'll get S XOR T = S XOR S XOR N = N which is the number he needs to find !
in our example you pull lever 5, so he finds levers 1,4,5 and 7 ON so S2= (0;0;1) XOR (1;0;0) XOR (1;0;1) XOR (1;1;1) =(1;1;1) = P(7), so the number is 7 !
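This strategy can be verified exhaustively over all 2^8 configurations and all 8 secret numbers. A sketch, representing P(i) simply as the integer i mod 8, so XOR of the 3-bit codes is integer XOR:

```python
from itertools import product

def xor_code(config):
    # XOR of P(i) = i mod 8 over the ON levers (lever 8 contributes 0)
    s = 0
    for lever, on in enumerate(config, start=1):
        if on:
            s ^= lever % 8
    return s

def lever_to_pull(config, secret):
    # pulling lever j XORs the code with j mod 8; pick j so the new code is P(secret)
    j = xor_code(config) ^ (secret % 8)
    return 8 if j == 0 else j

def friend_guess(config):
    s = xor_code(config)
    return 8 if s == 0 else s

# exhaustive check: exactly one lever is pulled, and the friend always wins
for config in product([0, 1], repeat=8):
    for secret in range(1, 9):
        j = lever_to_pull(config, secret)
        after = tuple(b ^ (i == j - 1) for i, b in enumerate(config))
        assert friend_guess(after) == secret
print("all 2048 cases check out")
```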
On October 19 2018 00:55 Joni_ wrote: This might make your thoughts go into a good direction for finding the solution, although it is not the full solution yet. It makes finding the solution a lot easier, tho, hence the spoiler:
I love this puzzle because thinking about it for a while makes you realise that there probably is no reason that this is abstractly any more "difficult" for 8 than for 4 or 2 levers and thinking about the solution for 2 levers actually can give you the correct idea for how to solve it with more levers.
Anyway, thank you for sharing that puzzle, it's just great! I had never heard it before and it just made me miss my rehearsal.
Actually, trying it for 2 levers misguided me xD, since one can just write down all the cases and make the partition by hand. But yeah, my solution above can be easily generalised.
On October 19 2018 01:39 opisska wrote: Two levers have a much simpler solution though: ignore one lever, the other one is 1 up 2 down. That's easy enough even for me not to mess up
True! The 2-lever case has several solutions that cannot be generalised; maybe messing with 4 levers can guide one towards salvation. =)
I'm using a program that calculates equity for different ranges in poker (hold'em). Here's the data I have:
Number of tries (exhaustive): 403835094432
Range 1 wins: 142599616520 Range 1 ties: 10945429968 Range 2 wins: 140980819160 Range 2 ties: 11040244316 Range 3 wins: 108972219224 Range 3 ties: 2243569640
Range 1 wins %: 35,3113% Range 1 ties %: 2,7104% Range 2 wins %: 34,9105% Range 2 ties %: 2,7338% Range 3 wins %: 26,9843% Range 3 ties %: 0,5556%
Range 1 equity: 36,5978% Range 2 equity: 36,2087% Range 3 equity: 27,1934%
I'm trying to determine how the program got the equity from the data.
Obviously you get the win equity by going (number of wins*100)/number of tries, and the tie equity by going (number of ties*100)/number of tries.
My understanding is that the equity percent is (wins%) + (part of ties% that are ties with only one player/2) + (part of ties% that are ties with both other players/3) - (because when you win you get all the money, when you tie with one player you get half and when you tie with both you get a third)
I run into an issue because the program doesn't let me know how often you tie with one player and how often you tie with both players.
I've been playing around with the numbers and I got the right results.
I added wins for all players and (tie/2) for all players. I get slightly over 100%, because some of the ties should have been /3 instead of /2.
I took the amount that was above 100% and did that *2, then /3.
Then I subtracted that result from the number of ties for one player.
Then I did number of wins for that player + (that result/2), and I got the right answer. I don't understand why. Shouldn't I also have had to divide an amount by 3?
On November 18 2018 21:48 Nebuchad wrote: Hi guys, I still suck at math
I'm using a program that calculates equity for different ranges in poker (hold'em). Here's the data I have:
Number of tries (exhaustive): 403835094432
Range 1 wins: 142599616520 Range 1 ties: 10945429968 Range 2 wins: 140980819160 Range 2 ties: 11040244316 Range 3 wins: 108972219224 Range 3 ties: 2243569640
Range 1 wins %: 35,3113% Range 1 ties %: 2,7104% Range 2 wins %: 34,9105% Range 2 ties %: 2,7338% Range 3 wins %: 26,9843% Range 3 ties %: 0,5556%
Range 1 equity: 36,5978% Range 2 equity: 36,2087% Range 3 equity: 27,1934%
I'm trying to determine how the program got the equity from the data.
Obviously you get the win equity by going (number of wins*100)/number of tries, and the tie equity by going (number of ties*100)/number of tries.
My understanding is that the equity percent is (wins%) + (part of ties% that are ties with only one player/2) + (part of ties% that are ties with both other players/3) - (because when you win you get all the money, when you tie with one player you get half and when you tie with both you get a third)
I run into an issue because the program doesn't let me know how often you tie with one player and how often you tie with both players.
I've been playing around with the numbers and I got the right results.
I added wins for all players and (tie/2) for all players. I get slightly over 100%, because some of the ties should have been /3 instead of /2.
I took the amount that was above 100% and did that *2, then /3.
Then I substracted that result from the number of ties for one player
Then I did number of wins for that player + (that result/2), and I got the right answer. I don't understand why. Shouldn't I also have had to divide an amount by 3?
Let me simplify your problem for you. You have 7 hidden 'real' probabilities:
1 wins, rest lose = a
2 wins, rest lose = b
3 wins, rest lose = c
1 ties 2, 3 loses = d
1 ties 3, 2 loses = e
2 ties 3, 1 loses = f
1 ties 2 ties 3 = g
You also have the individual perspectives:
1 wins, ties, loses = h, i, j
2 wins, ties, loses = k, l, m
3 wins, ties, loses = n, o, p
Now you know how player 1's win and tie probabilities decompose into the hidden ones:
h = a; d+e = x*i; g = (1-x)*i
Here x is the fraction of ties that are 2-way instead of 3-way.
OK. Now let's talk about normalized payout (assuming each round is same money). Let's assume in a tie of N players the payout is 1/N. So if there is one winner, payout is 1, payout is 1/2 in a two way tie, and payout is 1/3 in a 3-way tie.
Now we have the real question: what payout should player 1 expect? It is E = a + (d+e)/2 + g/3. This number is player 1's equity. It is also E = a + ix/2 + i(1-x)/3.
Solving this for x, one obtains x = 6(E-a)/i - 2 = 6(E-h)/i - 2.
This is the fraction of ties which were 2-way (and 1-x is then the fraction which were 3-way).
Now you can use this to determine (d+e) and g.
Let's summarize. You originally knew for each player what fraction he uniquely won and tied, as well as his payout. For each player, we broke that tied piece into two pieces: fraction of ties 2-way and 3-way. This depended on a single variable x. We solved for x using the payout function, and then can multiply the fraction back to determine the hidden info.
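The solved-for x can be computed directly. A sketch with the Range 1 numbers (the rounded percentages from the post above, so the output is approximate):

```python
def two_way_tie_fraction(E, a, i):
    # from E = a + i*x/2 + i*(1-x)/3, i.e. E - a = i*(x + 2)/6
    return 6 * (E - a) / i - 2

# Range 1: equity E, win% a, tie% i (rounded values from the post)
E, a, i = 36.5978, 35.3113, 2.7104
x = two_way_tie_fraction(E, a, i)
print(x)        # ~0.848: about 85% of Range 1's ties are 2-way
print(x * i)    # ~2.30: d + e, the share of hands that are 2-way ties for Range 1
```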
Thanks, that makes sense to me. I should have mentioned that we don't really have E exactly; the goal is to find E, since the displayed percentage is rounded (full disclosure: I have another set of ranges that also rounds to 27,1934% and I wanted the complete number to know which of the two is actually ahead). But I think I got this from here:
i = ((d+e)/2)+(g/3)
l = ((d+f)/2)+(g/3)
o = ((e+f)/2)+(g/3)
100 = h+k+n+d+e+f+g
(here i, l, o stand for each player's equity minus their win%, and the last line is just total probability)
should be enough information to find the 4 variables d, e, f, g, I believe. Although, since the equities sum to 100, the last equation follows from the first three, so one more independent equation is needed, e.g. player 1's tie% = d+e+g. Will look more closely when I get back home.
Is it just a coincidence that I got the "right" result with my random attempts earlier?
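For what it's worth, the system does pin down d, e, f, g once the individual tie percentages t1 = d+e+g, t2 = d+f+g, t3 = e+f+g are added. A sketch (the inputs are the rounded Range 1-3 values from the post, so the results are approximate):

```python
def hidden_tie_breakdown(t1, t2, t3, tie_equity_1):
    # d = P(1 ties 2 only), e = P(1 ties 3 only), f = P(2 ties 3 only),
    # g = P(3-way tie); tie_equity_1 = player 1's equity minus win%,
    # which equals (d+e)/2 + g/3 = t1/2 - g/6
    g = 3 * t1 - 6 * tie_equity_1
    s = (t1 + t2 + t3 - g) / 2          # s = d + e + f + g
    return s - t3, s - t2, s - t1, g    # d, e, f, g

d, e, f, g = hidden_tie_breakdown(2.7104, 2.7338, 0.5556, 36.5978 - 35.3113)
print(d, e, f, g)   # roughly 2.24, 0.06, 0.08, 0.41 (all in %)
```

As a consistency check, player 3's tie equity should be t3/2 - g/6 ≈ 0.209, and indeed 27,1934 - 26,9843 = 0,2091.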
this is poker, so what is in the pot. are there blinds, antes? what is the bet? is this allin preflop? is it just assumed this is a 1 on 1 on 1 scenario where each player contributes 33% of the pot?
On November 19 2018 08:33 travis wrote: well, there is missing information
this is poker, so what is in the pot. are there blinds, antes? what is the bet? is this allin preflop? is it just assumed this is a 1 on 1 on 1 scenario where each player contributes 33% of the pot?
That information isn't needed to calculate equity vs a range; it's needed to figure out how much equity you need to continue with the hand.
When I was doing my master's in math education, I took an awesome course called Problem Solving. We met once per week, for a three-hour block of time, and our professor would give us exactly one new problem to work on. We'd collaborate and explore the problem, tinker with a bunch of different strategies, try to solve the problem in as many ways as possible, consider additional extensions to the problem, and then reflect on and analyze the entire process.
My favorite problem in that class- which has come to be my favorite math problem of all time- is The Doubling Problem. Today and tomorrow I'm giving it to my high school math students (as it's right before Thanksgiving break, so I'd rather have the students play around with an interesting problem like this instead of trying to force down a final lesson and risk them losing focus).
The Doubling Problem is remarkably simple to explore, as it's just based on addition, multiplication, and moving around digits of a number. Thanks to its low mathematics entry point, I have plenty of students (and not just honors/ AP/ high-level students) who are making progress in solving the problem (or have already solved it).
Without further ado, I challenge you to solve The Doubling Problem (please spoiler your answers):
Pick a positive whole number. We’re going to apply a special rule to this number: take the last digit of that number (the ones’ digit) and move it to the front of the number. For example: 1234 becomes 4123 because the 4 is moved to the front, 567 becomes 756, 9002 becomes 2900, etc. Can you find a positive whole number such that, when you apply this rule to it, the resulting number is double the original number?
On November 21 2018 03:36 DarkPlasmaBall wrote: When I was doing my master's in math education, I took an awesome course called Problem Solving. We met once per week, for a three-hour block of time, and our professor would give us exactly one new problem to work on. We'd collaborate and explore the problem, tinker with a bunch of different strategies, try to solve the problem in as many ways as possible, consider additional extensions to the problem, and then reflect on and analyze the entire process.
My favorite problem in that class- which has come to be my favorite math problem of all time- is The Doubling Problem. Today and tomorrow I'm giving it to my high school math students (as it's right before Thanksgiving break, so I'd rather have the students play around with an interesting problem like this instead of trying to force down a final lesson and risk them losing focus).
The Doubling Problem is remarkably simple to explore, as it's just based on addition, multiplication, and moving around digits of a number. Thanks to its low mathematics entry point, I have plenty of students (and not just honors/ AP/ high-level students) who are making progress in solving the problem (or have already solved it).
Without further ado, I challenge you to solve The Doubling Problem (please spoiler your answers):
Pick a positive whole number. We’re going to apply a special rule to this number: take the last digit of that number (the ones’ digit) and move it to the front of the number. For example: 1234 becomes 4123 because the 4 is moved to the front, 567 becomes 756, 9002 becomes 2900, etc. Can you find a positive whole number such that, when you apply this rule to it, the resulting number is double the original number?
This looks rather trivial tbh. It took me like 2 minutes in Excel to find 105263157894736842. Obviously there are more solutions (e.g. concatenating this number with itself), but you most likely can also use different starting digits, didn't check.
All I did was:
r_0 = 0, p_0 = d
p_(i+1) = (2*p_i + r_i) % 10
r_(i+1) = (2*p_i + r_i) / 10 (/ used in the integer sense)
Abort when: p_x = p_0 AND p_(x-1) != 0 AND r_x == 0
For d you can put in any digit and see if it terminates. It doesn't for d=1, but it does for d=2 and gives the above solution.
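The digit-by-digit construction can be written out as a short program. A sketch of the same procedure (variable names are mine):

```python
def doubling_number(d0=2, max_digits=1000):
    # build N from the right: each new digit is double the previous one
    # plus the incoming carry, because rotating N's last digit to the
    # front must produce exactly 2*N
    digits = [d0]   # least significant digit first
    carry = 0
    while len(digits) < max_digits:
        nxt = 2 * digits[-1] + carry
        if nxt == d0 and digits[-1] != 0:
            # the leading digit of 2*N equals the moved digit: N is complete
            return int("".join(map(str, reversed(digits))))
        digits.append(nxt % 10)
        carry = nxt // 10
    return None     # never closed up for this starting digit

n = doubling_number(2)
print(n)                                        # 105263157894736842
print(2 * n == int(str(n)[-1] + str(n)[:-1]))   # True: rotation doubles it
print(doubling_number(1))                       # None: digit 1 never terminates
```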
On November 21 2018 03:36 DarkPlasmaBall wrote: When I was doing my master's in math education, I took an awesome course called Problem Solving. We met once per week, for a three-hour block of time, and our professor would give us exactly one new problem to work on. We'd collaborate and explore the problem, tinker with a bunch of different strategies, try to solve the problem in as many ways as possible, consider additional extensions to the problem, and then reflect on and analyze the entire process.
My favorite problem in that class- which has come to be my favorite math problem of all time- is The Doubling Problem. Today and tomorrow I'm giving it to my high school math students (as it's right before Thanksgiving break, so I'd rather have the students play around with an interesting problem like this instead of trying to force down a final lesson and risk them losing focus).
The Doubling Problem is remarkably simple to explore, as it's just based on addition, multiplication, and moving around digits of a number. Thanks to its low mathematics entry point, I have plenty of students (and not just honors/ AP/ high-level students) who are making progress in solving the problem (or have already solved it).
Without further ado, I challenge you to solve The Doubling Problem (please spoiler your answers):
Pick a positive whole number. We’re going to apply a special rule to this number: take the last digit of that number (the ones’ digit) and move it to the front of the number. For example: 1234 becomes 4123 because the 4 is moved to the front, 567 becomes 756, 9002 becomes 2900, etc. Can you find a positive whole number such that, when you apply this rule to it, the resulting number is double the original number?
This looks rather trivial tbh. It took me like 2 minutes in excel to find 105263157894736842. Obviously there more solutions (e.g. concatenating this number with itself, but you most likely can also use different starters, didnt check)
All I did was: r_0 = 0 p_0 = d d_i = (p_i*2)%10
r_(i+1) = (p_i*2)/10 (/ used in the integer sense) p_(i+1) = d_i+r_(i+1)
Abort when: p_x = p_0 AND p_(x-1) != 0 AND r_x == 0
For d you can put in any digit and see if it terminates. It doesn't for d=1, but it does for d=2 and gives aboves solution.
Yup, that's a solution! Of course, keep in mind that my students are merely using basic calculators and pencil and paper, rather than software and computer code.
Found by starting with "the last digit needs to be double the first, so let's start with 2". Then every digit ahead of it needs to be double the previous one, and if that would take you above 9, the excess 1 seeps over to the digit ahead of that.
So 2 as the last digit means it needs to end in 42.
42 means it needs to end in 842.
842 means 6842, with a 1 added to the next digit, notated as (1)6842 from here on.
(1)6842 means (1)36842.
(1)36842 means 736842.
Repeat until you reach a 1 in the first position, without any saved 1.
I'm trying to figure out how this algorithm works, https://en.wikipedia.org/wiki/Prime-counting_function, under the section "Algorithms for evaluating π(x)", using Legendre's inclusion-exclusion principle. For whatever reason I'm not getting the numbers I would expect.
Suppose x = 32. It says "p1, p2, p3, ..., pn are the prime numbers less than or equal to the square root of x." floor(sqrt(32)) = 5, so then p1 = 2, p2 = 3, and p3 = 5. Now, if I understand the formula (I'm not very good with formula notation), it seems to be then:
On November 21 2018 03:36 DarkPlasmaBall wrote: When I was doing my master's in math education, I took an awesome course called Problem Solving. We met once per week, for a three-hour block of time, and our professor would give us exactly one new problem to work on. We'd collaborate and explore the problem, tinker with a bunch of different strategies, try to solve the problem in as many ways as possible, consider additional extensions to the problem, and then reflect on and analyze the entire process.
My favorite problem in that class- which has come to be my favorite math problem of all time- is The Doubling Problem. Today and tomorrow I'm giving it to my high school math students (as it's right before Thanksgiving break, so I'd rather have the students play around with an interesting problem like this instead of trying to force down a final lesson and risk them losing focus).
The Doubling Problem is remarkably simple to explore, as it's just based on addition, multiplication, and moving around digits of a number. Thanks to its low mathematics entry point, I have plenty of students (and not just honors/ AP/ high-level students) who are making progress in solving the problem (or have already solved it).
Without further ado, I challenge you to solve The Doubling Problem (please spoiler your answers):
Pick a positive whole number. We’re going to apply a special rule to this number: take the last digit of that number (the ones’ digit) and move it to the front of the number. For example: 1234 becomes 4123 because the 4 is moved to the front, 567 becomes 756, 9002 becomes 2900, etc. Can you find a positive whole number such that, when you apply this rule to it, the resulting number is double the original number?
Whether there is a solution depends on the base, I think. For base 2 there should be none (if you don't allow leading 0s at least; otherwise you have 01 -> 10). Does anyone have a clue whether it is solvable for all bases bigger than a certain value, or solvable for some and not others, and whether there is a pattern which determines that?
On January 11 2019 00:44 enigmaticcam wrote: I'm trying to figure out how this algorithm works, https://en.wikipedia.org/wiki/Prime-counting_function, under the section "Algorithms for evaluating π(x)", using Legendre's inclusion-exclusion principle. For whatever reason I'm not getting the numbers I would expect.
Suppose x = 32. It says "p1, p2, p3, ..., pn are the prime numbers less than or equal to the square root of x." Sqrt(32) ≈ 5.66, so then it would be p1 = 2, p2 = 3, and p3 = 5. Now if I understand the formula (I'm not very good with formula notation), it seems to be then:
On January 11 2019 02:01 Simberto wrote: Furthermore, what you are counting is not the prime numbers up to 32 but, as the article clearly states, Pi(x) - Pi(Sqrt(x)) + 1.
Preceding the formula is the statement "A more elaborate way of finding pi(x) is ..". That seems pretty clear that it should be a formula to find pi(x), which is as stated in the beginning of the article a function to count the prime numbers less than or equal to x. I'm not sure how I'm misunderstanding that.
It is a way of finding pi(x). The way is to use basic equation solving to turn
N = pi(x) - pi(sqrt(x)) + 1
into pi(x) = N + pi(sqrt(x))-1
which, given that you can calculate N and clearly know pi(sqrt(x)) since you used all of the primes smaller than sqrt(x) to calculate N, is a way of finding pi(x)
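A minimal Python sketch of this method (function and variable names are my own, not from the article): sieve the primes up to sqrt(x), count the integers in [1, x] not divisible by any of them via inclusion-exclusion, and then apply pi(x) = N + pi(sqrt(x)) - 1.

```python
import math

def legendre_pi(x):
    """pi(x) via Legendre: count N of integers in [1, x] not divisible by
    any prime <= sqrt(x), then pi(x) = N + pi(sqrt(x)) - 1."""
    if x < 2:
        return 0
    root = math.isqrt(x)
    # simple sieve for the primes up to sqrt(x)
    sieve = [True] * (root + 1)
    sieve[0:2] = [False, False]
    for i in range(2, math.isqrt(root) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    primes = [i for i, is_p in enumerate(sieve) if is_p]

    # inclusion-exclusion: phi(x, a) counts n <= x coprime to the first a primes
    def phi(x, a):
        if a == 0:
            return x
        return phi(x, a - 1) - phi(x // primes[a - 1], a - 1)

    n = phi(x, len(primes))        # this is the N in the equation above
    return n + len(primes) - 1     # pi(x) = N + pi(sqrt(x)) - 1

print(legendre_pi(32))   # 11
```

For x = 32 the primes up to sqrt(32) are 2, 3, 5, giving N = 9 (the nine integers 1, 7, 11, 13, 17, 19, 23, 29, 31), and 9 + 3 - 1 = 11, which matches the eleven primes up to 32.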
On January 11 2019 02:41 Simberto wrote: It is a way of finding pi(x). The way is to use basic equation solving to turn
N = pi(x) - pi(sqrt(x)) + 1
into pi(x) = N + pi(sqrt(x))-1
which, given that you can calculate N and clearly know pi(sqrt(x)) since you used all of the primes smaller than sqrt(x) to calculate N, is a way of finding pi(x)
Okay, I think I get it. But I don't think it's useful for what I need to do. Thanks!
On November 21 2018 03:36 DarkPlasmaBall wrote: When I was doing my master's in math education, I took an awesome course called Problem Solving. We met once per week, for a three-hour block of time, and our professor would give us exactly one new problem to work on. We'd collaborate and explore the problem, tinker with a bunch of different strategies, try to solve the problem in as many ways as possible, consider additional extensions to the problem, and then reflect on and analyze the entire process.
My favorite problem in that class- which has come to be my favorite math problem of all time- is The Doubling Problem. Today and tomorrow I'm giving it to my high school math students (as it's right before Thanksgiving break, so I'd rather have the students play around with an interesting problem like this instead of trying to force down a final lesson and risk them losing focus).
The Doubling Problem is remarkably simple to explore, as it's just based on addition, multiplication, and moving around digits of a number. Thanks to its low mathematics entry point, I have plenty of students (and not just honors/ AP/ high-level students) who are making progress in solving the problem (or have already solved it).
Without further ado, I challenge you to solve The Doubling Problem (please spoiler your answers):
Pick a positive whole number. We’re going to apply a special rule to this number: take the last digit of that number (the ones’ digit) and move it to the front of the number. For example: 1234 becomes 4123 because the 4 is moved to the front, 567 becomes 756, 9002 becomes 2900, etc. Can you find a positive whole number such that, when you apply this rule to it, the resulting number is double the original number?
Whether there is a solution depends on the base, I think. For base 2 there should be none (if you don't allow leading 0s at least; otherwise you have 01 -> 10). Does anyone have a clue whether it is solvable for all bases bigger than a certain value, or solvable for some and not others, and whether there is a pattern which determines that?
Imho, those numbers should exist for all bases >2. Lets again take a quick look at the algorithm I also mentioned earlier, but adjust it to baseN:
We assume the last digit to be
p_0 = d, r_0 = 0,
and then continue, for i = 0, 1, 2, ...:
d_i = (2*p_i + r_i) mod N
r_(i+1) = (2*p_i + r_i) div N (integer division)
p_(i+1) = d_i
We can abort the algorithm when we get p_x == p_0 = d and r_x == r_0 = 0.
Now this isn't necessarily terminating for any d. But since p_i is in {0,...,N-1} and r_i is in {0,1}, there is only a finite number of combinations possible, so there must be cycles at some point, where p_k == p_l && r_k == r_l && k < l. Obviously, we can now state that for the lowest such k and l it holds that for all m > k: p_m == p_(m+l-k) && r_m == r_(m+l-k).
This means we have a number of cycles. If there is a single m>k, where r_m == 0, we can take p_m as last digit of our number and generate a valid number in regards to the original question.
So all we are required to do is to prove that in this cycle there is guaranteed to be a position where r_m != 1. And this is rather trivial (I will make some leaps here, but it should be obvious; if not, I can detail this), as the only cycle where r_m can stay permanently at 1 is the cycle with p_k = N-1 and r_k = 1. But in this case it is necessary that p_(k-1) = N-1 and r_(k-1) = 1. Now this violates the rule of k being the lowest k to build such a cycle. And as we originally started with a different pair (d, 0), it is clear that we can never enter this cycle -> there must be another cycle -> there is a solution.
PS: There is the additional condition, that p_(m-1) != 0. But it also can be shown, that this is only a problem for N=2, as in all other cases simply taking m+1 produces a valid solution.
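For anyone who wants to play with this, here is a quick Python sketch of the carry cycle described above (names are mine). It builds the digits from the last one and stops when the cycle closes back to (d, 0); it does not check for the leading-zero case from the PS.

```python
def doubling_number(base=10, last_digit=2, max_len=10**6):
    """Build a number that doubles when its last digit is moved to the
    front, by running the digit-doubling carry cycle described above.
    Returns the digits most-significant first, or None if the cycle
    never closes for this starting digit."""
    digits = [last_digit]               # a_0, the last digit of n
    p, carry = last_digit, 0
    while len(digits) < max_len:
        # next digit (working right to left) and the carry it produces
        p, carry = (2 * p + carry) % base, (2 * p + carry) // base
        if p == last_digit and carry == 0:
            return digits[::-1]         # the moved digit now leads 2n
        digits.append(p)
    return None

digits = doubling_number()
n = int("".join(map(str, digits)))
print(n)  # 105263157894736842
```

Doubling 105263157894736842 gives 210526315789473684, which is indeed the original number with its last digit moved to the front. The same function with base=3, last_digit=2 finds 1012 in base 3 (= 32 decimal, which doubles to 2101 in base 3 = 64).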
Yeah, it was like: watch one video and get lost in them for three hours. They're all so pretty. I saw a couple of videos ages ago and had no idea he continued to produce them at the same quality.
Some of them are on topics that I understand and problems I solve mechanically, but I had never thought about 2d or 3d interpretations of the rules.
I have been given the question "show that sigmoid(-x) + sigmoid(x) = 1".
I am not sure... what to show? Do I need to show how the limits of x going to infinity and negative infinity, or going to zero, make this equation go to 1?
(I can do that; it's just that another question asked for the upper and lower bounds of the function, so I already showed the limits... so it just confuses me a bit)
Well, did you plug in the expressions with the e power? What it does in the limit doesn't matter, because it should be true for all x, not just x going to plus or minus infinity.
On February 11 2019 05:19 travis wrote: and now I have another math question
sigmoid function is sigmoid(x) = 1/(1+e^(-x))
I have been given the question "show that sigmoid(-x) + sigmoid(x) = 1".
I am not sure... what to show? Do I need to show how the limits of x going to infinity and negative infinity, or going to zero, make this equation go to 1?
(I can do that; it's just that another question asked for the upper and lower bounds of the function, so I already showed the limits... so it just confuses me a bit)
Differentiate that function and observe its derivative’s properties.
On February 11 2019 06:04 mahrgell wrote: You show that for ANY x the equation holds true. So just transform the equation until you have the same on both sides of the equal sign.
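For the record, the transformation mahrgell describes is just putting both terms over a common denominator (multiply the first fraction by e^x/e^x); writing it out in LaTeX:

```latex
\sigma(x) + \sigma(-x)
  = \frac{1}{1+e^{-x}} + \frac{1}{1+e^{x}}
  = \frac{e^{x}}{e^{x}+1} + \frac{1}{1+e^{x}}
  = \frac{e^{x}+1}{e^{x}+1}
  = 1
```

This holds for every x, with no limits involved.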
1984's Rescue on Fractalus is being remade. The planet is a giant fractal. Pretty cool how unique an experience they were able to create in 1984 on such limited hardware. I think it was playable on the Atari 400. It definitely worked on the Commodore 64. I think the Atari 400 only had 16K of memory, while the "high powered" C64 had an amazing 64 kilobytes of RAM! If this game only worked on the Atari 800, then it's not as big of a technical feat.
In 1984, first-person space games were not very good. Using a fractal as the planet surface is such a cool, ballsy move. Man, back in the day LucasArts was pretty damn cool. Leave it to Disney to shut them down.
On January 15 2019 00:57 travis wrote: You guys may enjoy this
On November 18 2018 21:48 Nebuchad wrote: Hi guys, I still suck at math
I'm using a program that calculates equity for different ranges in poker (hold'em). Here's the data I have:
Number of tries (exhaustive): 403835094432
Range 1 wins: 142599616520 Range 1 ties: 10945429968 Range 2 wins: 140980819160 Range 2 ties: 11040244316 Range 3 wins: 108972219224 Range 3 ties: 2243569640
Range 1 wins %: 35,3113% Range 1 ties %: 2,7104% Range 2 wins %: 34,9105% Range 2 ties %: 2,7338% Range 3 wins %: 26,9843% Range 3 ties %: 0,5556%
Range 1 equity: 36,5978% Range 2 equity: 36,2087% Range 3 equity: 27,1934%
I'm trying to determine how the program got the equity from the data.
Obviously you get the win equity by going (number of wins*100)/number of tries, and the tie equity by going (number of ties*100)/number of tries.
My understanding is that the equity percent is (wins%) + (part of ties% that are ties with only one player/2) + (part of ties% that are ties with both other players/3) - (because when you win you get all the money, when you tie with one player you get half and when you tie with both you get a third)
I run into an issue because the program doesn't let me know how often you tie with one player and how often you tie with both players.
I've been playing around with the numbers and I got the right results.
I added wins for all players and (tie/2) for all players. I get slightly over 100%, because some of the ties should have been /3 instead of /2.
I took the amount that was above 100% and did that *2, then /3.
Then I subtracted that result from the number of ties for one player.
Then I did number of wins for that player + (that result/2), and I got the right answer. I don't understand why. Shouldn't I also have had to divide an amount by 3?
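As a sanity check on the equity split described above (full pot for a win, half for a two-way tie, a third for a three-way tie), here is a tiny Python sketch with made-up counts for a 3-player spot; all numbers are hypothetical, not from the program:

```python
# Sanity-checking the equity split with made-up counts for a 3-player spot.
tries = 1000
wins = [400, 300, 200]                            # outright wins per player
two_way = {(0, 1): 30, (0, 2): 20, (1, 2): 10}    # pots split two ways
three_way = 40                                    # pots split three ways

equity = []
for p in range(3):
    e = wins[p]                                                   # full pot
    e += sum(n / 2 for pair, n in two_way.items() if p in pair)   # half pot
    e += three_way / 3                                            # third of the pot
    equity.append(100 * e / tries)

print([round(e, 2) for e in equity])   # the three equities sum to 100%
```

However you distribute the tie counts between the two-way and three-way buckets, the equities must total 100%, which is one way to check guesses about the unknown split.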
Hi again
Similar to the last one, but significantly more annoying, I got this:
Number of tries (exhaustive): 1267892619840
Range 1 wins: 791976687662 Range 1 ties: 12482408790 Range 2 wins: 159949046721 Range 2 ties: 10525724277 Range 3 wins: 159949046721 Range 3 ties: 10525724277 Range 4 wins: 135556038238 Range 4 ties: 15394856484
Range 1 wins %: 62,4640% Range 1 ties %: 0,9845% Range 2 wins %: 12,6153% Range 2 ties %: 0,8301% Range 3 wins %: 12,6153% Range 3 ties %: 0,8301% Range 4 wins %: 10,6914% Range 4 ties %: 1,2142%
Possibilities are:
Range 1 wins = a Range 2 wins = b Range 3 wins = c Range 4 wins = d Range 1&2 tie = e Range 1&3 tie = f Range 1&4 tie = g Range 2&3 tie = h Range 2&4 tie = i Range 3&4 tie = j Range 1&2&3 tie = k Range 1&2&4 tie = l Range 1&3&4 tie = m Range 2&3&4 tie = n Range 1&2&3&4 tie = o
a, b, c and d are known.
Because range 2 and range 3 have the same stats, I also know that e=f, i=j and l=m
Because of the specifics of those ranges, I also know that e=f=0, k=0 (every time range 1 ties with range 2 or range 3, it must also tie with range 4).
I'm getting:
Range 1 ties => g/2+l/3+l/3+o/4=0,9845
Range 2 ties => h/2+i/2+l/3+n/3+o/4=0,8301
Range 4 ties => g/2+i/2+i/2+l/3+l/3+n/3+o/4=1,2142
From this I get that h/2 = (100-a-b-c-d)-1,2142 = 0,3996
I can also add all of the ties and get:
g+2i+2l+n+o = 3,0598
And from there I have trouble getting the other five variables, and I'm not sure if it's because I suck at this or because I just don't have enough equations to solve...
Unless you have additional information, I don't think you can conclude:
Because range 2 and range 3 have the same stats, I also know that e=f, i=j and l=m
Because of the specifics of those ranges, I also know that e=f=0, k=0 (every time range 1 ties with range 2 or range 3, it must also tie with range 4).
Just because two ranges have the same win percentages does not mean that they are the same hand. Neither does this necessarily mean that they have the same combination tie percentages e and f (etc....), nor that always only one of them ties with 1.
You could have a distribution where 2 often ties with 1 but never with 4, and 3 often ties with 4 but never with 1, which delivers the same tie percentages.
Other than that, if you have problems solving something like that, put it into a matrix and use Gaussian elimination. That is an algorithm that always solves a solvable set of equations. It's going to be a bit annoying with that many coefficients, but it will tell you the solution if there is a unique one, and the set of solutions if there isn't.
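As a concrete illustration of that suggestion, here is how a small linear system is solved in Python; the system itself is made up just to show the mechanics (the tie variables above would give a larger, possibly underdetermined one), and numpy does the elimination for you:

```python
import numpy as np

# A made-up 3x3 system A x = b, just to show the mechanics.
A = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])
b = np.array([8.0, -11.0, -3.0])

x = np.linalg.solve(A, b)   # does the elimination for you
print(x)                    # ≈ [2, 3, -1]
```

For an underdetermined system (more unknowns than independent equations), np.linalg.lstsq gives a least-squares solution instead, which is the situation you may be in with the tie variables.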
Because range 2 and range 3 have the same stats, I also know that e=f, i=j and l=m
Because of the specifics of those ranges, I also know that e=f=0, k=0 (every time range 1 ties with range 2 or range 3, it must also tie with range 4).
Just because two ranges have the same win percentages does not mean that they are the same hand. Neither does this necessarily mean that they have the same combination tie percentages e and f (etc....), nor that always only one of them ties with 1.
You could have a distribution where 2 often ties with 1 but never with 4, and 3 often ties with 4 but never with 1, which delivers the same tie percentages.
Other than that, if you have problems solving something like that, put it into a matrix and use Gaussian elimination. That is an algorithm that always solves a solvable set of equations. It's going to be a bit annoying with that many coefficients, but it will tell you the solution if there is a unique one, and the set of solutions if there isn't.
Yeah, I worded that poorly; I do have additional information. Ranges 2 and 3 contain the same hands, so it's true that e=f etc., and the way the ranges are constructed it's impossible for range 1 to tie with range 2/3 without also tying with range 4, so that information was correct.
I have an assigned question to determine the last two digits of 99^14 by using modulus 100. I know 99 is congruent to -1 (mod 100), which means 99^14 is congruent to (-1)^14 (mod 100), which means the last two digits must be 01 because (-1)^14 = 1.
On February 20 2019 10:34 travis wrote: I have an assigned question to determine the last two digits of 99^14 by using modulus 100. I know 99 is congruent to -1 (mod 100), which means 99^14 is congruent to (-1)^14 (mod 100), which means the last two digits must be 01 because (-1)^14 = 1.
did I do this correctly?
Hi - yes you did.
You can check this property because if a = bm+c and g = hm+k, then ag = (bm+c)(hm+k) = (bhm+ch+bk)m + ck. Thus ag ≡ ck (mod m).
Applying this to your problem, you have 14 99s, and for each one, c = k = -1 (without loss of generality). So you combine the first two and get that their product is equivalent to 1; then you take 99^2 and 99 together. Well, 99^2 ≡ 1 (mod 100), so its remainder must be 1. So now you have c = 1, k = -1. Thus 99^3 ≡ -1 (mod 100). We can use induction to show that what you said is correct.
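If you ever want to double-check results like this, Python's three-argument pow does modular exponentiation directly:

```python
# Verify the last two digits of 99**14 using modular exponentiation.
print(pow(99, 14, 100))   # 1, i.e. the last two digits are 01
print(pow(99, 15, 100))   # 99, i.e. odd powers end in 99
```

This matches the induction above: even powers of 99 are ≡ 1 (mod 100), odd powers are ≡ -1 ≡ 99 (mod 100).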
I've got some more review checking to do, this one is probability.. I've always been especially bad at probability, it confuses me pretty badly sometimes.
the question says that a student takes an hour long exam (so x is time, where x is {0,1}), with a probability of finishing within time x = x/2.
It then says that the student is still working after .75 hours, and what is the probability the student will take the entire hour to finish?
edit: I removed the crap I wrote because I realized I must be doing this question wrong... I will come back and fill in what I try doing
edit2:
I think I figured it out. I need P(x=1 | x>=.75)
which = P(x >= .75 and x = 1)/P(x >= .75)
which = P(x = 1)/P(x >= .75)
which are
(1-1/2)/(1-.75/2) = .8
the key was I forgot I need to subtract both values from one because the original probability was the chance of finishing it in that time, not of going past that time. I need to pay more careful attention when I am reading
On February 21 2019 04:26 travis wrote: Thanks Rodya.
I've got some more review checking to do, this one is probability.. I've always been especially bad at probability, it confuses me pretty badly sometimes.
the question says that a student takes an hour long exam (so x is time, where x is {0,1}), with a probability of finishing in time x = x/2.
It then says that the student is still working after .75 hours, and what is the probability the student will take the entire hour to finish?
I've tried to work it out and I've tried to get some guidance online but I am not making sense of it
What I did is:
P(A) = .75/2 P(B) = 1/2
I think I want P(B|A)
And I think P(B|A) = P(A and B)/P(A)
But that's where I am getting really confused. Isn't P(A and B) = P(B) ?
Which gives
P(B|A) = (1/2)/(.75/2)
which is 1.33333.. which is greater than 1. What am I doing wrong lol
Took me a few minutes to understand. Just to clarify, let me state what I guess the unwritten assumptions are: You meant to say P(t <= x) = x/2, where t is the time of finishing the exam, for x in [0,1) (instead of {0,1})? And that P(t = 1) = 1/2, because the student is forced to finish the exam once the hour has passed?
If A = "student still working at 0.75 hours" and B= "student takes full hour" then you are indeed looking for P(B|A). However, when calculating the probabilty of A, you are apparently confusing A with its complementary event.
On February 21 2019 04:47 travis wrote: Yeah, your assumptions are correct, sorry for having you read through that only to find I had edited it afterwards.
No problem, I have to get back into this stuff anyway; otherwise I wouldn't even have bothered.
You are right, but I'd mark off a point for your not-quite-right derivation. Note first that f(x) = x/2 is the cdf of the finish-time variable, which I will call y. The pdf of y must be g(x) = 1/2. Probability must sum to 1, so the integral of 1/2 over the domain of g should be 1. So that means g ranges from 0 to 2. Thus there is a 50% chance that the exam is not completed within the hour.
Now, p(y=1 | y>=.75) = p(y=1)/p(y>=.75) = (1/2)/(5/8) = 0.8. The calculation is helped by the fact that p(y>=.75) = 1 - p(y < .75) = 1 - F(.75) = 1 - 3/8 = 5/8. You had the right numbers but used them the wrong way around: P(y >= .75) is the integral from .75 to 2 of the density 1/2, which comes out to 5/8, not .75/2.
The best way to think about your problem is a hybrid distribution with both continuous and discrete components. Namely you have a continuous probability density on [0,1) and then a discrete probability mass of a half at 1. In terms of the cdf function that would be F(x) = x/2 for x in [0,1) and then a jump at 1 so F(x) = 1 at x=1. It doesn't make any sense to try to extend your sample space to [0,2] in the manner that Rodya is saying.
On February 21 2019 05:49 phwoar wrote: The best way to think about your problem is a hybrid distribution with both continuous and discrete components. Namely you have a continuous probability density on [0,1) and then a discrete probability mass of a half at 1. In terms of the cdf function that would be F(x) = x/2 for x in [0,1) and then a jump at 1 so F(x) = 1 at x=1. It doesn't make any sense to try to extend your sample space to [0,2] in the manner that Rodya is saying.
I agree with Rodya for subtracting the point. And I agree with you that this makes more sense as having a continuous and a discrete component.
Travis, if you don't understand your mistake, try doing the exact same exercise with the cdf F(x) = x/3.
You got lucky that 1 hour was the halfway point, and therefore P(x=1) = P(x < 1) = 0.5. With the cdf x/3 this is no longer the case, and you have to use P(x = 1 | x >= 0.75) = P(x = 1) / P(x >= 0.75) = (2/3)/(3/4) = 8/9, which is different from P(x < 1 | x >= 0.75) = (1/12)/(3/4) = 1/9...
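The mixed continuous/discrete reading can also be sanity-checked with a quick Monte Carlo simulation (the sampler below is my own construction, not from anyone's post):

```python
import random

def finish_time():
    """Sample the finish time: P(t <= x) = x/2 on [0, 1), plus a point
    mass of 1/2 at t = 1 (the student is cut off at the full hour)."""
    u = random.random()
    return 2 * u if u < 0.5 else 1.0

random.seed(0)
samples = [finish_time() for _ in range(100_000)]
still_working = [t for t in samples if t >= 0.75]
p_full_hour = sum(1 for t in still_working if t == 1.0) / len(still_working)
print(p_full_hour)   # close to the exact answer 0.8
```

Conditioning on "still working at 0.75 hours" and counting how often the full hour is used should land near P(x=1)/P(x>=0.75) = (1/2)/(5/8) = 0.8.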
I need to know how to differentiate matrices... but this is something we've literally never done, everything I am reading online is tough to follow, and I am confused as heck.
Starter problem:
compute the derivative of the function f(w) with respect to w_i, where w,x are in R^D and:
f(w) = 1/(1+e^(-w^t*x))
I honestly have no idea what to do. What's got me confused here is "with respect to w_i". I am expecting this problem is actually simpler than the later problems, but I am not sure what it's asking for. Is it asking for the derivative with respect to a given element of w? God, I can't wrap my head around this; I don't have a strong enough basis in calculus.
edit: am I literally just finding the derivative of 1/(1+e^(-x * w_i))?
edit2: or is it the summation of said derivatives?
On February 22 2019 00:42 travis wrote: I see the mistake, thanks!
Now on to something a bit more less common.
I need to know how to differentiate matrices... but this is something we've literally never done, everything I am reading online is tough to follow, and I am confused as heck.
Starter problem:
compute the derivative of the function f(w) with respect to w_i, where w,x are in R^D and:
f(w) = 1/(1+e^(-w^t*x))
I honestly have no idea what to do. What's got me confused here is "with respect to w_i". I am expecting this problem is actually simpler than the later problems, but I am not sure what it's asking for. Is it asking for the derivative with respect to a given element of w? God, I can't wrap my head around this; I don't have a strong enough basis in calculus.
edit: am I literally just finding the derivative of 1/(1+e^(-x * w_i))?
edit2: or is it the summation of said derivatives?
The 'derivative' of a linear transformation matrix A is itself.
I'm guessing w^t is the transpose of w? And I suppose the * in w^t * x is the dot product? For future reference, most people would prefer you write w^T not w^t. Allow me to write out the function more explicitly for you.
I still don't understand. Up to this point it's pretty much what I thought, but I still don't get what it means to take the partial derivative with respect to w_i
if I do that, how do I even represent the rest of the vector entries in the derivative?
Okay, fantastic. Now is where it gets a lot harder; I will be back soon lol. I really appreciate the help; they really threw us in the deep end here. I think the upcoming questions are typically graduate-level stuff, and I'm not even good at math for an undergrad.
it's for a deep learning course, if you're curious. the course isn't even really a theory class or anything but I guess it's good for me to get some understanding of this stuff
That means, if w is a column vector, w^T is the row vector with the same numbers
(w1)
(w2)
(w3)

becomes (w1, w2, w3)
That means that w^T x is simply the scalar product of w and x. The reason people use this way of writing is that you don't have to specifically define the scalar product you want to use, and can instead use the normal rules of matrix multiplication. (And it is more generally usable.)
Firstly, you are right that the sum of derivatives is equal to the derivative of the sum. One way to remember this is by remembering that the derivative operation is *linear*.
Secondly, w^T*x^(i) basically means the dot product of w and x^(i). They are both vectors, so taking the dot product makes sense. The reason one of them is transposed is so that you can ignore dot products and just do matrix multipication (yes vectors are matrices). For your example we have w^T*x^(i) = 3(0.4) + 7(0.6).
Thirdly, when we talk about the derivative of a function with respect to a vector, we are talking about the gradient of that function. You can google it, but the gradient is just a vector containing all the first order partial derivatives of the function. I would be wrong about this, however, if your book has decided to use the word "derivative" to mean "total derivative".
Fourthly, the derivative that you gave is close to being correct. You are right that the functional form of the derivative of |f(x)| is equal to f'(x)f(x)/|f(x)|; however, you need to take into account the third point (so your answer should be a vector). Also, you may want to know that this derivative may not be defined at certain points w for which wx^(i)-y^(i) = 0. But wherever the derivative is defined... you will have d|f(x)|/dx = f'(x)f(x)/|f(x)|.
Lastly, a couple other problems with your answer. w^(i) doesn't make sense - only x and y are indexed by i. x^(i) is a vector, not the i'th entry of x. The j'th entry of x^(i) is denoted (x^(i))_j. Also, that x on the far left of your answer should be (x^(i))_j or something, since you are taking a PARTIAL derivative with respect to w_j, not w itself.
I said a lot but hopefully I didn't confuse you, you seem to pretty much get it. The third point I wrote is really the only important one.
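Rodya's dot-product example, written out in numpy for reference (the numbers 3, 7, 0.4, 0.6 are the same made-up values from the post above):

```python
import numpy as np

# w^T x as a dot product: w = (3, 7), x = (0.4, 0.6).
w = np.array([3.0, 7.0])
x = np.array([0.4, 0.6])
print(w.T @ x)   # ≈ 5.4  (= 3*0.4 + 7*0.6; for 1-D arrays .T is a no-op)
```

Note that numpy's 1-D arrays blur the row/column distinction, which is why w.T @ x and w @ x give the same number here; on paper, w^T x is what makes the matrix-multiplication rules line up.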
On February 22 2019 06:41 Rodya wrote: Firstly, you are right that the sum of derivatives is equal to the derivative of the sum. One way to remember this is by remembering that the derivative operation is *linear*.
Secondly, w^T*x^(i) basically means the dot product of w and x^(i). They are both vectors, so taking the dot product makes sense. The reason one of them is transposed is so that you can ignore dot products and just do matrix multipication (yes vectors are matrices). For your example we have w^T*x^(i) = 3(0.4) + 7(0.6).
I said a lot but hopefully I didn't confuse you, you seem to pretty much get it. The third point I wrote is really the only important one.
you definitely aren't making my confusion worse, only better, but this is the part that I don't understand.
Is my confusion here just from a lack of understanding of the notation? I am wondering if this is something you guys are trying to tell me.
Lastly, a couple other problems with your answer. w^(i) doesn't make sense - only x and y are indexed by i. x^(i) is a vector, not the i'th entry of x. The j'th entry of x^(i) is denoted (x^(i))_j.
to further clarify my confusion, if x^(i) is a vector, then we are saying that x is potentially a > 1 dimensional matrix (so it potentially has more than one vector)? and i is the column we are multiplying w by?
so if that's the case then in this problem, W may or may not have more than one row? but then our output is a vector, not a scalar? (I think you said that)
edit:
to check another problem, it asks, "find ∇_w of f, where f(w) = tanh[w^Tx]"
I am guessing that ∇_w f means "the derivative of f with respect to w".
so to answer this I did chain rule u = w^Tx
d/dw(w^Tx) = x (I looked it up.. I don't know why it's x ... im guessing order of multiplication matters here. not really going to spend the time going into it)
so anyways, answer = x*sech^2[w^Tx]
did i screw this one up or is it really that straightforward?
gonna try doing the last two next, they are clearly the most difficult
Okay, I see what you were doing. Notice that x*sech^2(w^T x) is a vector - this is what you're saying is the gradient of f in this other problem. And that is indeed the correct answer.
Now going back to the original problem, let me just write the answer so you can see the differences: (1/2)(sum from i=1 to m) x^(i)(w^Tx^(i)-y^(i)) / |w^Tx^(i)-y^(i)|. Apologies for misreading your answer. You are using some shortcuts to calculate the whole gradient at once, which is fine, but I wasn't expecting it.
Let me give you a quick calculus review. So, the output of f(w) is a number (it looks like a loss function). However, f(w) is a multivariable function; it has as many variables as there are elements in w. Now, I'm assuming that w is a vector, not a matrix (although it definitely could be); let's assume for now that it is a vector.

You talk about the derivative of f with respect to w. However, think about the function g(z) = z_1 + 2z_2 + 3z_3, where z = (z_1, z_2, z_3) is some vector. You probably can guess that the 'derivative' of g is equal to (1,2,3), but strictly speaking, you can only take a derivative with respect to a single-dimensional variable (i.e. something that varies over the real numbers and therefore isn't a vector). Thus we can't calculate dg/dz; we can only calculate dg/dz_1, dg/dz_2, and dg/dz_3. So we simply DEFINE dg/dz to be the VECTOR (dg/dz_1, dg/dz_2, dg/dz_3), which you can easily check is equal to (1,2,3).

Similarly, the derivative of f(w) is defined to be the vector (df/dw_1, ..., df/dw_D). You can check your work and see that the answer I gave you can be found using either your method or the method of partial derivatives. I'm actually lying a bit to you here, but it's very useful (and I'm sure your teacher expects you) to think of the derivatives of g and f this way. It will help to demystify some of your derivations and let you check your work using basic calculus.
Earlier you mentioned that you were anxious about taking derivatives of matrices - so since you've mentioned it again, let me briefly justify the fact that d/dw (w^T x) = x. Let x be an arbitrary matrix, then we can think of it as a linear transformation that maps vectors w to xw. Therefore, if we denote that linear transformation by f(w), we have f(w) = xw. Now, a derivative is supposed to be the best linear approximation of a function at each point. But a linear transformation is linear everywhere, so it is ITSELF the best linear approximation to ITSELF at every point. Thus the best linear approximation (i.e. the derivative) of f(w) is x. So d/dw(f(w)) = x. Hopefully it gives you some peace of mind at least, even if it isn't a rigorous argument.
Oh, and you asked about whether since x^(i) is a vector, that x is a matrix. You haven't defined x yet, I have no clue what it is. It's possible that in your book, the notation used is that x^(i) is the i'th column or row of x. If that is the case, then x would be a matrix. If your book does not use that notation though, then x isn't anything - it is undefined, so don't think about it and don't let it appear in your derivations unless you define it first.
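One practical tip that follows from all of this: you can always check a derived gradient numerically with finite differences. A sketch for the starter problem f(w) = sigmoid(w^T x), with made-up values for w and x:

```python
import numpy as np

# Finite-difference check of the gradient of f(w) = sigmoid(w^T x).
# Each partial df/dw_i should equal sigmoid'(w^T x) * x_i.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f(w, x):
    return sigmoid(w @ x)

w = np.array([0.3, -1.2, 0.7])   # made-up test values
x = np.array([1.0, 2.0, -0.5])

s = sigmoid(w @ x)
analytic_grad = s * (1 - s) * x        # one entry per component of w

eps = 1e-6
numeric_grad = np.array([
    (f(w + eps * np.eye(3)[i], x) - f(w - eps * np.eye(3)[i], x)) / (2 * eps)
    for i in range(3)
])

assert np.allclose(analytic_grad, numeric_grad, atol=1e-7)
print(analytic_grad)
```

If the two vectors disagree, the pencil-and-paper derivation (or the indexing) is wrong somewhere; this catches most gradient mistakes before they hit a deep learning assignment.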
I solved the last two derivatives on my own... they were horrible and long but they used the same concepts.
In other news, I am being forced to turn my homework in using LaTeX, and it includes graphs and stuff. This does not seem worth it; it is making my homework take SOOOO long to do, and I expect that people outside of academia don't really end up using this, so I am not sure why the professor wouldn't let us write it on a damn piece of paper.
If your professor is okay with it, you can cut down the time it takes to put graphs in the document by using snipping tool (or screen shot) to take a picture of the graph, then use \includegraphics{mypicture.png} to insert the graph crudely into the doc. You can also manually adjust the size of the picture.
You gotta include \usepackage{graphicx} in the preamble though to do that.
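Putting those two pieces together, a minimal document looks like this (the filename mypicture.png is just the placeholder from the post above):

```latex
\documentclass{article}
\usepackage{graphicx}  % required for \includegraphics

\begin{document}
% scale the screenshot to 60% of the text width
\includegraphics[width=0.6\textwidth]{mypicture.png}
\end{document}
```

The optional width (or height, or scale) key is how you "manually adjust the size of the picture".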
Also, http://detexify.kirelabs.org/classify.html is your best friend. Just draw any symbol and it will tell you how to make it, and if you need a special package to do so.
Latex is great. Drawing graphs with latex is absurd. Just use whatever and include them as images. Tikz is awesome for making pretty diagrams, but there are much better and easier ways of drawing graphs.
Yes, nobody outside of (computer) science uses it, but you're being trained as a computer scientist, so I don't see the problem with learning it. We had to learn it in 1st year CS, and once you get the hang of it, you'll never want to use Word again.
Well I don't use word either(unless I am forced to), I just write stuff on paper and then scan it to pdf. As long as the handwriting is legible I can't imagine latex being superior to that, but maybe I am wrong?
I suppose I would want it for publishing any papers, though...
On February 23 2019 10:49 travis wrote: Well I don't use word either(unless I am forced to), I just write stuff on paper and then scan it to pdf. As long as the handwriting is legible I can't imagine latex being superior to that, but maybe I am wrong?
I suppose I would want it for publishing any papers, though...
"As long as the handwriting is legible" is the key point there, though. Remember that your professor is going to have to correct and grade this work for everybody in your class, not just you.
LaTeX is one of my hidden shames, since I use it quite a bit but have never actually spent the time to sit down and learn how it really works. In that sense, a lot of it is 'magic' to me since I mostly know the proper rituals and incantations but would have a hard time rebuilding my collection of useful patterns from scratch.
For graphs (if we're talking figures) I've never used LaTeX directly. I usually use gnuplot (ugly) and then include the PNG into the LaTeX document. There's much nicer graphing libraries out there, I've just never spent the time to find them.
For drawing graphs (if we're talking graph-theory) I've mostly used the Tikz library, as mentioned. I found the Tikz automata library very useful for drawing things like DFAs, NFAs, etc. in courses that required it.
In some math electives I took I ended up using LaTeX for all the coursework. The profs seemed to appreciate it, and I found that it actually cut down on the writing time overall.
I wanted to verify a true or false section on my homework that I did, it's abstract algebra groups. I just want to make sure I am understanding this stuff correctly. Note that the * denotes that the set does not contain 0
T/F:
1.) Q (under addition) is a subgroup of R (under addition): true
2.) R* (under addition) is a subgroup of R (under addition): false (no identity in R*)
3.) R* (under multiplication) is a subgroup of R (under addition): false (they need to be the same operation, right?)
4.) {1,-1} (under multiplication) is a subgroup of R* (under multiplication): true
5.) {2^k : k in Z} (under multiplication) is a subgroup of Q* (under multiplication): false (no zero in Q* so no inverse for 2^k... am I understanding that right?)
On February 27 2019 02:20 mahrgell wrote: At first glance I would also consider 3 true.
I can't remember the exact definition of a subgroup, so maybe it just isn't even well defined if the operators are different. But mostly: R* doesn't have 0, so it doesn't have the identity for addition, thus is not a group under addition, and therefore is also not a subgroup.
For the multiplication the 1 is the same as the 0 for the addition.
Yes... so?
A group is not a subgroup of another just because it is a group and its elements are a subset. It has to include the identity.
Including the identity is not enough either; it needs to be a group under the same operator, so you can't simply redefine the group operator in the subgroup. So (3) fails on both those points.
At least we learned a more theoretical version, where it is enough to be isomorphic to a subgroup (by your/the wiki definition) of the supposed supergroup to be considered a subgroup from a group-theoretical point of view. How you name your elements and operations really doesn't matter then. And then obviously your isomorphism maps the identity of one group onto the identity of the other group.
Now there is a trivial isomorphism (ln(x)) between (R+\{0}, *) and (R,+).
This is, in fact, what led me to my initial thought. But at least right now, I can't extend this to (R\{0}, *). Which leads me to believe that this is indeed false, just for very different reasons.
I think (3) is highlighting the issue about the operator and not the identity (Travis is correct, different operator so not a subgroup). Needing the identity was the purpose of example (2).
(5) is true and I don't know where the confusion came from. The lack of zero in Q* is irrelevant. The inverse for 2^k is 2^(-k) so you are fine when k ranges over Z.
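To make (4) and (5) concrete, here is a quick finite sanity check (a Python sketch, not a proof, since {2^k : k in Z} is infinite; it just spot-checks a range of exponents with exact rationals):

```python
from fractions import Fraction

# (4) {1, -1} under multiplication: check the subgroup criteria directly.
H = {1, -1}
closed = all(a * b in H for a in H for b in H)
has_identity = 1 in H
has_inverses = all(any(a * b == 1 for b in H) for a in H)

# (5) {2^k : k in Z}: the inverse of 2^k is 2^(-k), which is again of the
# form 2^j -- no zero needed.  Spot-check a range of exponents.
powers_ok = all(Fraction(2)**k * Fraction(2)**(-k) == 1 for k in range(-5, 6))
```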
Edit: To the above, being isomorphic to a subgroup is not the same as being a subgroup. There are times when the distinction is not important so we treat isomorphic subgroups as subgroups (mathematicians are often lazy like that). Sometimes however it can cause problems and it is important to remember they are different.
ah right, the identity in 5.) is 1.. not 0. the confusion was that I was thinking 2^0 = 1 .. temporarily forgetting that the member of my group is 2^k, not k. if that confuses you don't worry about it, my thinking clearly didn't make sense. glad I posted them though!
When you look at the definition of a subgroup on wiki, "(G,*) is a group. A subset H of G is called a subgroup of G if the restriction of * to H x H is a group operation on H.", you don't see any mention of a 'preexisting' binary operation on H itself. So when answering the question "is R* (under multiplication) a subgroup of (R,+)?", you can disregard the given operation (multiplication) totally. What you have to show is: (i) is R* a subset of R? (ii) is ( R* , + ) a group? (with + being the restriction of + mentioned in the definition of a subgroup.)
R* is not a subgroup of ( R , + ) because the identity element is missing. (And for no other reason.)
@mahrgell: ln : R+\{0} -> R does map the identity element of ( R+\{0} , * ) to the identity element of ( R , + ), since ln(1) = 0. So it really is a group isomorphism between ( R+\{0} , * ) and ( R , + ). The obstruction is only in extending it to all of R\{0}: (R\{0}, *) cannot be isomorphic to (R, +), because -1 has order 2 while (R, +) has no element of finite order other than 0.
On February 27 2019 11:13 travis wrote: ah right, the identity in 5.) is 1.. not 0. the confusion was that I was thinking 2^0 = 1 .. temporarily forgetting that the member of my group is 2^k, not k. if that confuses you don't worry about it, my thinking clearly didn't make sense. glad I posted them though!
Nope, I understand what you did. It was an easy enough mistake to make.
The purpose of these simple examples is to help you get a feel for the definitions and what they mean/represent. It can be very confusing when definitions are given without the motivation behind them, and this can happen with purely algebraic objects like groups. So it is good to play around with some examples, and it is why I didn't want to only tell you that you were wrong, because that does not help you understand.
I have an example on my homework I missed. it asks about symmetric/transitive/reflexive
question was to classify the problem as each and if it's not then give a counterexample
the problem I missed was:
m,n in Z
m~n if m*n > 0
I said it was reflexive, symmetric, and transitive. I got the problem wrong; it says it's not reflexive because x=0 does not work, but I thought the entire point was that you pick an m,n such that they fulfill the equation....
I guess maybe I am misunderstanding, and the set is not defined by that relationship...? Instead it's just like a test of whether or not the relationship passes?
So the task is to check whether the relation R \subset Z x Z, i.e. {(m,n) \in Z x Z | m*n > 0}, is reflexive, symmetric, transitive:
reflexive: R is reflexive if for all m \in Z: (m,m) \in R. R is not reflexive, because 0 \in Z and 0*0 = 0, so (0,0) \notin R. For (0,0) to be an element of R, the following would have to be true: 0*0 = 0 > 0, which is obviously false.
symmetric: R is symmetric if for all m,n \in Z: (m,n) \in R => (n,m) \in R. Suppose (m,n) \in R. => m*n>0 (* is commutative) => n*m>0 => (n,m) \in R. So R is symmetric.
transitive: R is transitive if for all m,n,l \in Z: (m,n) , (n,l) \in R => (m,l) \in R. Suppose (m,n) , (n,l) \in R. => m*n>0 and n*l>0 => m*n>0 and n*l>0 and (m,n,l > 0 or m,n,l < 0) [the cases m,n>0 n,l<0 and m,n<0 , n,l>0 are contradictory, so you can disregard them.] => (m*n>0 and n*l>0 and m,n,l > 0) or (m*n>0 and n*l>0 and m,n,l < 0) => m*l >0 => (m,l) \in R. So R is transitive.
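The three checks above can also be brute-forced on a finite window of Z (a sketch; one counterexample, m = 0, is all reflexivity needs):

```python
# The relation m ~ n iff m*n > 0, tested on a small window of Z.
window = range(-5, 6)

def rel(m, n):
    return m * n > 0

reflexive = all(rel(m, m) for m in window)          # fails at m = 0
symmetric = all(rel(n, m) for m in window for n in window if rel(m, n))
transitive = all(rel(m, l) for m in window for n in window for l in window
                 if rel(m, n) and rel(n, l))
```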
I feel bad about using this thread but maths are hard for me, sorry ^.^
We start from a group of combinations, in this case 296 combinations.
I want to assign each combination a different frequency, in this case there will be three frequencies (25%, 50% and 75%).
If I go 296/3 and give 25/50/75 to each of those three groups respectively, I'll have a total frequency of 50% obviously. I'd like to have a different total frequency, in this case 66%.
66% is 2/3, 25% is 1/4, 50% is 2/4 and 75% is 3/4. So I have groups of 3/12, 6/12 and 9/12 and I'm trying to get to 8/12 for my total. There are multiple solutions. I'm trying to find the most "harmonious" one (pretty bad term to use but by that I mean, the 33%-33%-33% solution for a 50% total frequency is more "harmonious" than, say, 1% at 25, 98% at 50 and 1% at 75, even though that would also give me 50%).
Just to rephrase, you want: 0 < x, y < 1 and (1/3*(1 - x))/4 + (1/3*(1 - y))/2 + 3*1/3*(x + y - 1)/4 = 2/3
Or, equivalently: 2x + y = 8 with 0 < x, y < 12
And you want the solution that minimizes x, y, and 12 - x - y simultaneously. There are probably many ways of defining that harmoniousness, but let's say MSE does that: you want to minimize x^2 + y^2 + (12 - x - y)^2 = 2x^2 + 2y^2 + 144 + 2xy - 24x - 24y
Didn't really read the post above me, but it looks like he doesn't think there's a solution? The problem is 3x+6y+9(1-x-y) = 8. It's clear that the LHS taken as a function is smooth, and that (1,0) evaluates to 3 and (0,0) evaluates to 9, so by the IVT there is some (x,y) which gives us 8. The only question is whether it's obtainable with only a collection of 296 things.
This can obviously be solved in mathematica in two seconds, but I am guessing this is a homework problem so he can't do that.
First just solve for y to get y = (1-6x)/3. Now we need to choose an x so 296x is an integer, and 296y is an integer. So x = z/296 for some integer z. Thus we have y = (1-6(z/296))/3. Note that we can choose any z we like here. The only thing left to do is to find a z so that y = w/296 for some integer w.
If you follow the above logic you'll be able to figure out that we want to find an integer w = (1/3)296 - 2z for some integer z. In other words, we want a solution to the equation w + 2z = (1/3)296. But this clearly has no solution: since gcd(1,2) = 1, the left side w + 2z can be any integer, while (1/3)296 is not an integer. Therefore your problem has no solution. Why? Because if we can't choose a z that solves this equation, then there are no x which 1) solve 3x+6y+9(1-x-y) = 8 and 2) satisfy that 296x is an integer.
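The same conclusion falls out of a brute-force search over all ways to split the 296 combinations (a Python sketch; a, b, c are my own names for the counts at 25/50/75%). An exact total of 8/12 would need 3a + 6b + 9c = 8 * 296 = 2368, but the left side is always a multiple of 3 and 2368 is not, so the best achievable total misses by 1 in units of 1/(12 * 296):

```python
# Counts a, b, c of combinations at weights 3/12, 6/12, 9/12, with
# a + b + c = 296.  An exact total of 8/12 needs 3a + 6b + 9c = 2368.
best_miss = min(
    abs(3 * a + 6 * b + 9 * (296 - a - b) - 2368)
    for a in range(297)
    for b in range(297 - a)
)
```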
Just a quick note: please if you don't know whether or not your problem has a solution - say so! It's kind of annoying to have to prove it for myself. I would have tried to estimate a good guess right away, but I gotta go.
edit: I fixed this post. I originally had the problem set as 3x+6y+9(1-x-y) = 4 instead of = 8.
What? I gave a solution. It maximizes "harmoniousness" if you define that as the MSE of the difference of each of the frequencies from the "ideal" 33.3%. If you define harmoniousness differently, then you'll get a different solution.
Acro, the key to his problem is that there are only 296 things to split the frequencies among. That's like the whole point. Can you say exactly how many elements should have each frequency (of the 296 elements) according to your solution?
If we remove the restriction that we only have 296 discrete units, then obviously there are lots of elegant results like a 00/33/66 split.
I actually made a mistake in my post but I don't want to fix it right now.
But why? He doesn't restrict it to integers, so why should you?
He wants to assign EACH "combination" a value. There are 296 combinations, so we need to assign values to 296 things. That's the problem. If that part of the problem is removed, then your whole thing with MSE isn't necessary anyway. A 00/33/66 split is 1) way more elegant and 2) actually gives you 8/12.
He DOES restrict the problem in this way, it's in his post. If he doesn't mean to restrict it this way then the question is odd since it's so easy to solve.
Okay so; it's not a maths problem or something I was ordered to do. I'm doing it for myself. It could be that it's easy to solve but I can't because I suck, that's entirely possible (or likely).
Having integers isn't necessary, I'll have to do rounding on the results anyway even if it's integers. I can see how the way I worded the problem made you think that, sorry - again, I don't claim to not suck at this ^.^
That being said I do want a value for each group, I can't have one group at 0, so your easier solution wouldn't work for what I'm trying to get.
As per PM I guess I just have to choose one way to organize the numbers and then stick with it. I was sort of hoping that there would be one that would jump out, like 33-33-33 for a 50% frequency, but it doesn't seem to be the case with other frequencies.
I sadly still don't understand your problem. If you have 296 groups, three values, (1/4, 1/2, 3/4), and you want to assign each group one of these three values such that the average value is as close to 2/3 as possible, then I read you correctly the first time.
If I'm correct that this is your problem then I can tell you that it is impossible to get the average value to be exactly 2/3. I can try to figure out how close you can possibly get if you would like. What Acro did would be an upper bound on the accuracy of my answer though.
If you want help again post in the thread - good luck though if not.
Hm. I made a mistake above and am having the damndest time fixing it. I made a mistake in the constraints, and so the total percentage is above 100% :/
But adding that constraint makes the whole thing harder to compute, because it makes it two variables instead of one: you can no longer use that the third variable is (1 - x - y); you need to enforce that x + y + z = 1 explicitly. So you have a quadratic equation in two variables, which you can optimize using the ellipsoid method, but I don't have a tool handy here to do that.
Talking to Nebuchad in PM, however, it seems like his general problem has more than 3 "combinations", so the general solution requires something like that anyway.
Basically, figure out the formula you want to minimize, and formalize the constraints correctly. Then plug these into a constrained optimization tool (if you don't want the squared error, but rather the simple error, you can even use simplex for this). As long as the formulas are convex (for MSE they definitely seem to be), you can use the ellipsoid method; otherwise you have a real cracker of a problem (same if you don't have linear or quadratic problems to optimize in the first place, or have non-linear constraints). Check that the solution does indeed work and you didn't make a mistake somewhere.
By Z you mean the set of integers? Sounds strange to me, because Z is countable while R* is uncountable. Or are you looking "just" for an (injective?) homomorphism instead of an isomorphism?
Edit: Nevermind. I somehow missed that H isn't actually R*.
Your map isn't correct, because a homomorphism must map the identity element of G to that of H, i.e. 0 -> 1.
I have another question, to compute 101^(4,800,000,023) (mod 35) by hand
I honestly can't figure out how to do this one. I had a feeling, so I checked 101^10 mod 35 and saw that it is 1. So I can do 101^23 mod 35 to get my answer. I can just do that by using powers of two.
But... I only know that 101^10 mod 35 is 1 because I checked with wolfram alpha (I had a suspicion). By hand, it's not like I would just be checking every power... I couldn't.
So how am I supposed to do this one?
ab mod n = ((a mod n)(b mod n)) mod n
a^b mod n = ((a mod n)^b) mod n
a^(b+c) = (a^b)(a^c)
a^(bc) = (a^b)^c
Those rules are enough to do it in only a few steps.
Basic idea would be to first calculate 101 mod 35, because there is no reason to ever have a number above 34 if you are calculating mod 35 anyways.
This is 31 or -4
Then you just calculate the first few powers of 31 to see when it first ends up at 1. You will need at most as many steps as the modulus, due to stuff that you should have proven in group theory at some point. Here that happens at the sixth power: (-4)^6 = 4096 = 117*35 + 1, so 101^6 is 1 (mod 35).
You can then simply remove any multiples of this number from the exponent without problems, as those factors are just 1s.
(I might have calculated something incorrectly here, because that would mean that 101^10 is not 1 (mod 35).)
So in your case, you can remove 4800000018 from the exponent, because that is divisible by 6.
Meaning your result is 31^5 mod 35, which we already know is -9, i.e. 26, from the calculations we did above.
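The whole hand calculation is easy to double-check with Python's three-argument pow (a sketch for verification only, not part of the "by hand" exercise):

```python
# 101 is -4 (mod 35); the order of 101 mod 35 is the first power giving 1.
order = next(k for k in range(1, 35) if pow(101, k, 35) == 1)

# Remove all multiples of the order from the exponent, then exponentiate.
reduced_exponent = 4_800_000_023 % order
answer = pow(101, 4_800_000_023, 35)
```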
err yeah sorry, 101^10 mod 35 was 11, not 1
your solution is probably exactly right, I'll verify it
and I think I can see that to reach the identity we would need to raise to a power that divides 24, right? Using Euler's theorem: 35 = 7*5, so the group order is phi(35) = (7-1)*(5-1) = 24.
I think this is the correct application?
EDIT: after going through your solution I see what you did, you used the -4 all the way through... that's really cool lol
Yes, that sounds right. You need to verify that 35 and 101 are coprime for this to work, which they are.
So you can also just remove any multiples of 24 from the exponent without doing any work whatsoever. That still leaves 23 as an exponent, which is the same as an exponent of -1 (since 23 = 24 - 1). So you just need to find the inverse of 31, if you want to go about it that way. But looking for inverses in modulo groups was pretty annoying and basically involved just testing all group members until you find one that works, if I recall correctly.
You can use the extended euclidean algorithm to do that.
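A sketch of that in Python: the extended Euclidean algorithm returns Bezout coefficients, and the coefficient of a is the inverse whenever gcd(a, n) = 1, so no exhaustive testing is needed.

```python
def ext_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g == s*a + t*b."""
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t

def mod_inverse(a, n):
    g, s, _ = ext_gcd(a, n)
    if g != 1:
        raise ValueError("no inverse: a and n are not coprime")
    return s % n

inv = mod_inverse(31, 35)  # no need to test all group members
```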
In b, you need to understand what "ker(alpha)" and "alpha(G)" mean. Those are not statements about single elements of the group: the kernel is everything that maps onto 1 (the identity) via this homomorphism, and the image alpha(G) is everything that you can reach by starting with an element of G and applying the homomorphism.
And yes, basically the easiest way to get this is to simply calculate alpha of each member of G; there are not that many elements in either group anyway, and alpha is not that hard to calculate. Figure out which elements map onto 1, and which elements can actually be reached via this homomorphism. Those are your two answers.
Edit: Example:
1*1 = 1 ===> 1 ∈ ker(alpha) and 1 ∈ alpha(G)
2*2 = 4 ===> 2 ∉ ker(alpha), because it doesn't map onto 1, and 4 ∈ alpha(G), because 2 maps onto 4 (in both groups)
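Since the original exercise isn't quoted here, a hypothetical toy version of that computation: take alpha(x) = x^2 on G = (Z/5)* (my own example, not the homework groups), and brute-force the kernel and image exactly as described above.

```python
G = {1, 2, 3, 4}        # (Z/5)* under multiplication

def alpha(x):           # hypothetical homomorphism x -> x^2
    return (x * x) % 5

kernel = {x for x in G if alpha(x) == 1}   # everything mapping to the identity
image = {alpha(x) for x in G}              # everything that gets reached
```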
question for everyone: anyone around these parts have a good grasp of circumscription? (like in logic and reasoning, for an AI class).
I can't make heads or tails of it; we were given few resources with which to learn it, and I find the notation in the homework to be incredibly confusing.
i have a dumb question that i couldn't google.
what are the odds that in two million coin flips, i have a result of 54% or greater in favor of heads? (and specifically not tails, though that would be a follow up question. what if i wanted 54% or greater in either direction, is it just twice the probability?)
nice! but, it only goes to 100,000 trials, and i don't understand how it would tell me the odds of getting 54% or better i guess. it's been a decade since i've been into math.
oh i can see, i put in 54,000 success and take the probability of it being greater. sweet. halfway there.
Question on something I've been struggling with. If I have a line with a slope of x, and this line reflects off a mirror with a slope of y, is it possible to calculate the slope of the reflected line using basic algebra without trig?
On April 25 2019 01:44 brian wrote: i have a dumb question that i couldn’t google.
what are the odds that in two million coin flips, i have a result of 54% or greater in favor of heads? (and specifically not tails, though that would be a follow up question. what if i wanted 54% or greater in either direction, is it just twice the probability?)
For that many coin flips I think you can safely use a normal distribution because of the central limit theorem. Make a head 1 and a tail 0 so that the final number is how many heads you got.
But I think the answer will be so close to zero as to be negligible.
Btw, you are correct that it is twice the probability because in general P(A or B) = P(A) + P(B) - P(A and B) and if A is "54% or more heads" and B is "54% or more tails" then P(A and B)=0 because you cannot have both 54% or more of heads and tails. By symmetry P(A)=P(B).
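Putting numbers on "negligible" (a sketch using the normal approximation; math.erfc underflows to exactly 0.0 at this scale, so the rough log estimate is the informative part):

```python
import math

n = 2_000_000
mean = 0.5 * n
sd = math.sqrt(n * 0.5 * 0.5)      # std. dev. of the number of heads

z = (0.54 * n - mean) / sd         # about 113 standard deviations out
p_at_least_54 = 0.5 * math.erfc(z / math.sqrt(2))   # P(heads >= 54%)

# Rough order of magnitude: log10 P is about -z^2 / (2 ln 10).
log10_p_estimate = -z * z / (2 * math.log(10))
```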
On April 25 2019 02:12 enigmaticcam wrote: Question on something I've been struggling with. If I have a line with a slope of x, and this line reflects off a mirror with a slope of y, is it possible to calculate the slope of the reflected line using basic algebra without trig?
It requires a bit of 2D geometry, but yes.
This stack exchange post, the second answer in particular, explains how: Stackexchangeislove
Remember that the 'normal vector' to the 'mirror line' has slope -(1/y).
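For what it's worth, here is how the algebra can come out (my own derivation, reflecting the direction vector (1, s) across the mirror direction (1, m), so check it against the stack exchange answer before trusting it):

```python
def reflect_slope(s, m):
    """Slope of a line of slope s after reflecting off a mirror of slope m.

    Reflects the direction vector (1, s) across the mirror direction (1, m);
    no trig functions involved.
    """
    dx = (1 - m * m) + 2 * m * s       # reflected x-component (up to scale)
    dy = 2 * m + (m * m - 1) * s       # reflected y-component (up to scale)
    if dx == 0:
        raise ValueError("reflected line is vertical (undefined slope)")
    return dy / dx
```

Sanity checks: a horizontal mirror (m = 0) flips the slope's sign, and the mirror y = x (m = 1) swaps x and y, turning slope s into 1/s.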
Thank you! I'll spend some time reading through the links you posted. My linear algebra is limited, so will take me some time to understand what they're saying.
For considering the vectors involved, it is easier to imagine that your x slope and the mirror slope cross at the origin (slopes are invariant under translation). If you only care about calculating the slope after reflection, this will make everything muuuuch easier to calculate.
It's that time of year where my procrastination catches up with me and it starts getting near finals and I realize I don't understand anything about any of my classes. So this thread will probably get flooded with questions from yours truly.
Let's start it off with some question from my crypto class.
It says: "Determine the points on the elliptic curve E : y^2 = x^3 + 2x + 1 over Z_11. How many points are on this curve?"
So, I have never quite done something like this before. Does this mean that both y and x need to be in Z_11? And thus, if x^3 + 2x + 1 is not a perfect square, then the corresponding point will not be on the curve...? So do I need to simply solve for y^2 for x = 0 to 10, keep only the ones that are perfect squares, and then solve for y? Furthermore, when I solve for y^2, do I do modulus to each x term, or do I only do modulus to the entire equation?
I am not certain of the details here, but a few things I can answer.
Usually if you say "a curve over Z_11", you mean that both y and x are in Z_11.
Furthermore, when I solve for y^2, do I do modulus to each x term, or do I only do modulus to the entire equation?
Doesn't matter; both are equivalent, so you should probably take the modulus whenever you get a number larger than 10: a(mod 11) + b(mod 11) = (a+b)(mod 11)
And my guess would indeed be to simply get all the possible values for both sides of the equation (try 0-10), and see which numbers appear on both sides. But there might be another, smarter way to do this.
So, I have never quite done something like this before. Does this mean that both y, and x need to be in Z_11?
Yes. You are solving the given algebraic equation, but the algebraic structure you are working with is Z_11 instead of the real or complex numbers. Luckily, Z_11 is a field so it shouldn't be too annoying to solve the equation.
And thus, if x^3 + 2x + 1 is not a perfect square, then the corresponding point will not be on the curve...?
A perfect square in Z_11, to be precise (which may or may not correspond to perfect squares in the reals.) But yes, that is correct.
So do I need to simply solve for y^2 for x=0 to 10, and only keep the ones that are perfect squares and then solve for y?
Yep! There might be a quicker way to solve it (I can't think of any off the top of my head), but that will certainly get the job done.
Furthermore, when I solve for y^2, do I do modulus to each x term, or do I only do modulus to the entire equation? Thanks!
It shouldn't matter whether you take the modulus of each term individually or of the entire equation, because modular arithmetic is defined using representatives: [x]+[y] = [x+y] and [x]*[y] = [x*y].
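The brute-force approach described above (try every x, keep the ones whose right-hand side is a square mod 11) is easy to sketch in Python. This is just an illustration of the method, not part of the assignment:

```python
# Enumerate the points on E: y^2 = x^3 + 2x + 1 over Z_11 by brute force:
# for each x in Z_11, keep every y in Z_11 whose square equals the
# right-hand side mod 11.
p = 11

def rhs(x):
    """Right-hand side x^3 + 2x + 1, reduced mod p."""
    return (x**3 + 2*x + 1) % p

points = [(x, y) for x in range(p) for y in range(p) if y * y % p == rhs(x)]

print(points)
print(len(points))  # number of affine points; the curve group also
                    # contains the point at infinity
```

Running this search finds 15 affine points, so 16 in total once the point at infinity is included.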
here is another problem from crypto class. maybe you guys can just help me understand what the question is actually asking. it says
Can the following be solved in polynomial time? Given a prime p, a value x in Z*_(p-1), and y := g^x (mod p) where g is a uniform value from Z*_(p-1), find g, ie, compute y^(1/x) (mod p).
I am not really understanding this problem because it sounds like we are given p, x, y. So then we can compute the answer in constant time.... by just plugging in the numbers... What is it actually asking? lol
Have you tried plugging in the numbers? Is it really trivial?
Just take some random 3-digit prime, and pick x, y somehow randomly. Let's just say p = 197, x = 146, y = 123. If you can just plug in the numbers, this should take you at most 20 seconds to solve. (I didn't check the solution.)
ok so constant time was hyperbole and no im not actually sure but i def do think this is polynomial time solvable so i guess i will get to work
you are saying that i likely am interpreting the problem correctly then, though?
edit: so, as an actual answer to the question, can't we literally just check every prime g, and then do g^x mod p? it's awful but that would be polynomial.. But maybe the assumption is we don't know how big our set of primes is, and we don't have a list of primes
edit2: ok now i think it isn't always solvable in polynomial time and is analogous to the discrete logarithm problem
On April 29 2019 07:39 travis wrote: you are saying that i likely am interpreting the problem correctly then,. though?
I was pointing you at the fact that your initial interpretation doesn't survive any random example (and that it would probably not be asking too much to at least try it once...).
Seems to me you can do this in polynomial time and your initial edit is correct. You know p, y and x. So even if there is no constant time algorithm for finding the root in modular arithmetic, you can definitely do it in polynomial time:
Just run through all values i \in Z*_(p-1) and see if i^x = y (mod p). The check is constant time, so your algorithm overall is O(p).
So why the edit2?
Honestly it didn't seem right to just say "check every prime", im worried this isn't what they want as the solution.... but I suppose it is polynomial to verify a value is prime as well, so it's still polynomial, so i'll go for it
Well in fairness, one would need to come up with an example that has primes and an integer solution, which your random example didn't. but your point was valid anyways
I am very fuzzy on this, since it has been a while since i did stuff like that, but i seem to recall that there was a lot of maths going on with finding roots in finite fields. Do you have any notes that you could go over to see if you did anything like that?
Nope, in fact we've spent extraordinarily little time (in this class) on actual math. just some slides on euler's theorem and the chinese remainder theorem.
Maybe I'm missing something, but you don't have to check primes? You're given the prime, right?
So lets say you are given: p=29 y=6 x=2
Now the assignment is to find g, with g^2 = 6 (mod 29).
Right?
So we don't care about primes: we have the one we need. We just need to find g. So let's define f(i) = i^2 (mod 29) and we start:
i = 1: f(i) = 1; i = 2: f(i) = 4; i = 3: f(i) = 9; ... i = 8: f(i) = 6. Hurrah!
Worst case, you get to i = p - 1 (in this case 28) before you find g. So complexity class is O(p).
E: btw, I am not claiming this is the fastest algorithm to do it. There may actually be faster ones: my modular arithmetic is very rusty. I'm just showing an upper bound of linear time. There may be a constant time algorithm for finding the modular root and I just don't know it.
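That linear search, using the example values p = 29, y = 6, x = 2 from the post, can be sketched in Python. Python's built-in three-argument pow does the fast modular exponentiation; this is just the O(p) upper-bound algorithm, not necessarily the fastest one:

```python
def modular_root(y, x, p):
    """Brute-force search for g with g^x = y (mod p).

    Tries every candidate g in 1..p-1, so it costs O(p) modular
    exponentiations -- an upper bound, not an optimized method.
    """
    for g in range(1, p):
        if pow(g, x, p) == y:  # fast modular exponentiation
            return g
    return None  # no x-th root of y exists mod p

# The worked example from the post: p = 29, y = 6, x = 2.
print(modular_root(6, 2, 29))  # prints 8, since 8^2 = 64 = 6 (mod 29)
```

Note that g = 8 is not prime, which matches the later observation that the candidates need not be prime at all.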
ah right... i was misunderstanding the question AND overcomplicating it anyways. so, our value g was generated from Z*_p, but that doesn't mean it has to be prime. And it wouldn't have mattered anyways because I can check non-prime integers anyways.
You should take the complexity of the modular exponentiation (f(x)) into account, but of course it remains polynomial.
alright, im a bit behind in my abstract algebra class and now im lost as heck
so here comes a wave of questions. please help me understand how to do this stuff
first question says:
Let n >= 5. It can be shown that the only normal subgroups of S_n are {(1)}, A_n, and S_n. a.) For each normal subgroup N of S_n above, describe what the quotient group S_n / N is isomorphic to.
So first, the extent of my understanding:
A normal subgroup H of S_n is a subgroup of S_n in which gH = Hg (left cosets and right cosets are the same) for g in S_n
A quotient group is a normal subgroup... but that's about as far as I understand it
I see a rule (aN)(bN) = (abN)
Is it the case here that a,b are cosets of G, and then the quotient group is abN which arises from the product of the group operation between a and b?
So then back to the question of {(1)}, A_n, and S_n in S_n: I don't really know how to solve what these quotient groups are isomorphic to. What kind of steps do I take?
I do have guesses though, I think for (1) the answer is S_n, for A_n the answer is Z_2, and for S_n the answer is the identity
It's been a while, but here's what I recall:
- Iirc there is a rather trivial way to answer this question by calculating the number of elements of S_n / N, though this does not really provide any further understanding.
- I don't think it makes sense to say that a "quotient group is a normal subgroup". I mean, a quotient G/H generally isn't even a subset of G (though often it can be identified with one, but I would not expect that this identification automatically comes with a group structure).
Not quite sure on the algebra, but any question of "prove that for all n > something..." invites a proof through induction, so prove it for n=5 and then assume for n and prove for n+1.
I don't think you need that here.
With regards to the original question:
Quotient groups are groups where you simply chuck all elements of a larger group which are equal (in some way) together as if they are one.
As an example, take a look at the quotient group Z/3Z. This is the same group as Z, if you consider all elements which are equal modulo 3 as equal. As such, the only things in there are the equivalence classes of 0, 1 and 2. All other elements of Z are equivalent to one of these. Or formulated another way: which different types of elements are there in Z with regard to the result of z + 3Z?
Luckily, the statement about the normal subgroups of S_n does not need to be proven here, but can instead be taken as a given.
And there is a very simple solution involving the number of elements in each of these groups, as there is a theorem which states that |G| = |N|*|G/N|.
So the answer which was given by travis is indeed correct.
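As an aside, the |G| = |N|*|G/N| counting can be checked concretely for n = 5 by enumerating permutations. This is just a sketch (the parity helper is mine, not from the thread): it splits S_5 into even and odd permutations and counts the cosets of A_5.

```python
from itertools import permutations

def parity(perm):
    """Parity of a permutation given as a tuple: 0 = even, 1 = odd."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return inversions % 2

n = 5
S_n = list(permutations(range(n)))
A_n = [q for q in S_n if parity(q) == 0]

# Every coset g*A_n is either A_n itself (g even) or the set of odd
# permutations (g odd), so S_n / A_n has |S_n| / |A_n| = 2 elements,
# matching the guess that S_n / A_n is isomorphic to Z_2.
print(len(S_n), len(A_n), len(S_n) // len(A_n))  # 120 60 2
```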
So then is the way this type of question would typically be solved (assuming you are a confused student and not a math god) to simply fill in the |G| = |N|*|G/N| and then base a conclusion on it?
Next two questions continued from that one:
suppose α: S_n --> H is a group homomorphism (i believe we are still dealing with n>=5). Describe the possibilities for what the image of α(S_n) could be isomorphic to.
Should this answer be formed from the theorem you just told me of |G| = |N|*|G/N| ?
Show that a homomorphism α : S_n ---> Z_5 must be the trivial one: α(j) = 0 for all j in S_n
I am very confused here Is it saying that this homomorphism exists for all n >= 5?
On May 01 2019 06:53 travis wrote: So then is the way this type of question would typically be solved (assuming you are a confused student and not a math god) to simply fill in the |G| = |N|*|G/N| and then base a conclusion on it?
It is one tool, but often not enough for a proof. If the answer is that |G/N| = 1 or 2 (maybe 3? unsure) or |G|, then your answer is clear. But if it is something like 4, then this alone doesn't prove what the group is isomorphic to.
Generally speaking, you should try to understand what exactly equivalence classes are, and then figure out which different ones of those you have. This can be as simple as calculating g*N for each member g of G, and looking at how many different results you have. If we go back to the Z/3Z example, we notice that 0 + 3Z is the same as 3 + 3Z or 6 + 3Z (+ instead of * because that is the operation of the group Z). The only fundamentally different results are 0 + 3Z, 1 + 3Z and 2 + 3Z. These are your equivalence classes, and that means that Z/3Z is isomorphic to the group {0;1;2} with the group operation g+h (mod 3), the group you call Z_3.
Here, the above calculation would not have helped either, because neither Z nor 3Z are finite.
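The Z/3Z picture can be made concrete in a couple of lines (purely illustrative):

```python
# Every integer z falls into the equivalence class of z mod 3; collecting
# representatives over any range of integers yields only three classes.
classes = {z % 3 for z in range(-20, 20)}
print(sorted(classes))  # [0, 1, 2]

# The group operation on classes is inherited from Z: [a] + [b] = [a + b].
a, b = 2, 2
print((a + b) % 3)  # [2] + [2] = [1] in Z/3Z
```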
suppose α: S_n --> H is a group homomorphism (i believe we are still dealing with n>=5). Describe the possibilities for what the image of α(S_n) could be isomorphic to.
Should this answer be formed from the theorem you just told me of |G| = |N|*|G/N| ?
Is H a normal subgroup? Or just any group? Anyway, because you are not talking about quotient groups, I don't think you need that equation. My first guess would be to take a look at the definition of homomorphism and try to see where that leads you. The whole A_n and sgn ideas might also help. (I am not leading you along here, I also don't immediately know the answer right now.) {1} and S_n are definitely possible results, though, because they always are. And Z_2, due to the sgn homomorphism.
Nvm, thought about it a bit, and there was a theorem for this, the first isomorphism theorem links images of homomorphisms and quotient groups. This is clearly what you need to use here.
Second question is probably quickly answered using the result of the previous one. If you see that neither Z_5 nor any of its subgroups except the trivial one (and Z_5 has no others) can be isomorphic to the image of a homomorphism α: S_n ---> H, then the only possible homomorphism from S_n to Z_5 is the trivial one.
Edit: Your addition: The trivial homomorphism always exists between any two groups. It simply sends everything in group one to the identity in group two. What you need to show is not that this exists, but that no other does. And that is true for all n >= 5.
Unrelated edit 2: I will have to freshen up on my knowledge of algebra for a major exam in about half a year. I was thinking about doing this by explaining this stuff in a thread here on TL, possibly giving some worksheets to interested people, as i find that i learn very well by teaching stuff. Would people be interested in this? And what would be the best way to do this, probably a dedicated thread?
if it's abstract algebra and it's before my final (about 2 weeks from now), i'd definitely be interested. will read the rest of the post later, am gonna go eat my chipotle
Okay, I read the first isomorphism theorem and it's the most confusing thing I have ever read.
I guess I need to remember what the kernel of a homomorphism is. The kernel describes the elements of G (in this case, S_n), which are mapped to the identity in H. That's confusing. Is that mapping ever not just a mapping between the identity in G and the identity in H?
So once I know the kernel, we can say the kernel is normal in G (S_n). ... not sure what that means. The kernel itself is a normal subgroup?
The theorem then says if we let q be a mapping from G--->G/K (where K is the kernel of G), then q is the canonical homomorphism. I have no idea what that means though.
It then says that there then exists a unique isomorphism G/K ----> α(G), where α describes the original homomorphism?
I am a bit disheartened at my inability to understand this, lol
On May 01 2019 09:05 travis wrote: Okay, I read the first isomorphism theorem and it's the most confusing thing I have ever read.
I guess I need to remember what the kernel of a homomorphism is. The kernel describes the elements of G (in this case, S_n), which are mapped to the identity in H. That's confusing. Is that mapping ever not just a mapping between the identity in G and the identity in H?
Maybe it is a language problem, so just to avoid any confusion: the kernel of a homomorphism f: G -> H is the preimage of the identity element 1_H with respect to f. So it always contains 1_G, but it may also contain other elements.
So once I know the kernel, we can say the kernel is normal in G (S_n). ... not sure what that means. The kernel itself is a normal subgroup?
Yes. This is really a key statement in algebra.
The theorem then says if we let q be a mapping from G--->G/K (where K is the kernel of G), then q is the canonical homomorphism. I have no idea what that means though.
You should check this again because right now, you are missing some assumptions of the theorem, or at least you have not stated them here explicitly. You can only speak of a kernel if there is a homomorphism to which this kernel belongs.
Also, the way you say this isn't precise. There is the canonical homomorphism q: G -> G/K, mapping g to its equivalence class. However, there could also be other homomorphisms G -> G/K (for example, the trivial homomorphism which maps everything to the identity element). So no, it is not true to say "if q is a mapping G -> G/K, then q is the canonical homomorphism".
It then says that there then exists a unique ismorphism G/K----> α(G) where α describes the original homomorphism?
I am a bit disheartened at my inability to understand this, lol
So basically you want K to be the kernel of α(?). And iirc no: while there is an isomorphism d: G/K -> α(G), the uniqueness is only guaranteed if you require that d ∘ q = α.
Namely, in this case the interesting fact is that if α is a homomorphism from G to H (Or in this case S_n to H), this means that the image of α is isomorphic to the quotient group S_n/ker(α).
Furthermore, ker(α) is a normal subgroup.
With the answers to the first question, we know that there are not a lot of normal subgroups in S_n, and that there are thus not a lot of those quotient groups. Namely, the only possible quotient groups are {1}, S_n and Z/2Z. These are also the only possible images of alpha due to what the theorem says. (Obviously depending on alpha itself, and on the group it goes to)
If you want to go into more detail, depending on H, {1} is always possible, Z/2Z only possible if H has an element of order 2 (And thus a subgroup isomorphic to Z/2Z), and S_n is only possible if H has a subgroup isomorphic to S_n.
It looks as though you are being walked through a proof without being told you are proving something. I don't like that way of teaching maths. I prefer motivating things first.
Sometimes we want to move from one group G into another group G', so we want a map α:G-->G'. For this to be useful as groups we want the map α to preserve group structure; ie we want α(gh)=α(g)α(h) for all g,h in G. We call such maps homomorphisms.
The question is: What can we say about α(G) in terms of G?
Firstly, some elements of G may get mapped to the identity 1' in G'. We call the set of such elements the kernel of α. We denote it ker(α) and this is a subgroup of G. This follows directly from the definition of subgroup, together with the homomorphism property. Lets use K to denote the kernel for short.
Now we know that α(k)=1' in G' for all k in the kernel (by definition of kernel) so for any g in G, α(gK)=α(g) so to understand α(G) we only need to look at cosets of K.
So lets have a brief aside about cosets: Let U be a subgroup of G (I'm only using U because it is a new letter). We let G/U denote the set of left cosets of U in G, ie things of the form gU. There are two questions we could ask: 1) How many cosets are there? 2) When is G/U a group?
(1) The crucial fact about cosets is that they are either disjoint or the same, ie either gU=hU or they have no intersection at all. This is because if there is some intersection then gu=hv for some u,v in U and then gU = hvu^-1U = hU since vu^-1 is in U
Since different cosets cannot overlap we know |G| = Number of cosets x Size of a coset = |G/U| x |U|
(2) The most obvious way to make G/U a group is to say (gU)(hU)=(gh)U but this does not always give us a group. We would need (gU)(g^-1U)=U for inverses to work. So if U satisfies gUg^-1=U for all g in G then we call U a normal subgroup of G. It is equivalent to say left cosets are right cosets because gUg^-1=U if and only if gU=Ug and this condition is enough to make G/U a group, which we would then call the quotient group. This quotient group is not the same as U.
Now we can write down a homomorphism G-->G/U by sending g to its coset gU. We have to check that it is a homomorphism, ie that gUhU=(gh)U but this follows from U being a normal subgroup. This is what people meant by the canonical homomorphism.
Anyhoo, let's get back to our map α:G-->G' and its kernel K. We can quickly check that for all g in G we have gKg^-1 = K, since for any k in K, α(gkg^-1) = α(g)α(k)α(g^-1) = α(g)1'α(g)^-1 = 1' in G'.
so K is a normal subgroup of G and the set G/K of cosets is a group. We can actually write down a homomorphism A:G/K-->α(G) by saying gK goes to α(g). We need to check that this is well defined, by which I mean that if gK=hK then α(g)=α(h) and this follows from definitions; If gK = hK then gKh^-1 = K and so 1' = α(K) = α(gKh^-1) = α(g)α(K)α(h^-1) = α(g)α(h)^-1.
Finally, we can prove A is an isomorphism. It is injective because if A(gK) is the identity then α(g)=1' so g is in K and gK=K. It is surjective onto α(G) because if g' is in α(G) then there is some g in G such that α(g)=g' and then A(gK)=α(g)=g'.
This proves that isomorphism theorem; G / kernel(α) is isomorphic to α(G).
In summary, there are two important things to remember (besides definitions): (a) For any normal subgroup N of G, |G| = |N| x |G/N|. (b) For any homomorphism α:G-->G', kernel(α) is a normal subgroup of G and G/kernel(α) is isomorphic to α(G).
Onto the specifics about S_n We are told that it has only three normal subgroups {1}, S_n, and A_n. You say the quotients are S_n, {1}, and Z_2 respectively (which is correct and can be proven using fact (a) assuming you know the size or index of A_n in S_n).
So lets assume we have a homomorphism α: S_n ---> Z_5. There always is a homomorphism because I can send everything to the identity. Are there any other homomorphisms though? You know what the quotient groups of S_n are and using fact (b) you should be able to say what the image of α could be.
I hope this helps If not, please ask for more clarification/explanation.
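Facts (a) and (b) combine with Lagrange's theorem into a short counting argument, sketched here for n = 5 (the variable names are mine): the image of α is isomorphic to some quotient S_5/N, so its order is 120/|N|, and the image is also a subgroup of Z_5, so its order must divide 5.

```python
from math import factorial

n = 5
order_Sn = factorial(n)                                  # |S_5| = 120
normal_subgroup_orders = [1, order_Sn // 2, order_Sn]    # |{1}|, |A_5|, |S_5|

# Fact (b) + fact (a): |image| = |S_n / ker(alpha)| = |S_n| / |ker(alpha)|.
possible_image_orders = {order_Sn // m for m in normal_subgroup_orders}

# Lagrange: the image is a subgroup of Z_5, so its order divides 5.
divisors_of_5 = {d for d in range(1, 6) if 5 % d == 0}   # {1, 5}

print(possible_image_orders & divisors_of_5)  # {1}: only the trivial homomorphism
```

The only order compatible with both constraints is 1, which is exactly the claim that α must be trivial.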
just took a practice exam, there were a few questions I missed or just wanted some clarification on:
first was T/F question
T/F: every infinite cyclic group is isomorphic to Z
This question actually comes up again in a different way. I guessed True and got it, but I don't understand why. Why is Z cyclic, could someone explain? Is it because 1 is considered a generator of the infinite group Z?
Give an example of an abelian group with 12 elements that is not cyclic
The solution sheet said Z_2 x Z_4; I said A_4.
is A4 also a solution?
Give a reason why Z is not isomorphic to Q
I didn't really know how to answer this. Their answer was that Z is cyclic and Q is not. See the question above I guess.
We have a function f: ZxZ ---> Q*, s.t. f(a,b) = 2^a 3^b
What is the kernel of f?
The answer here is that the kernel is (0,0) I am still having trouble with the kernel of a homomorphism.
Is it because identity in Q is 1, and we get 1 by doing a=0,b=0?
In an attempt to clear up your confusion with the kernel, the kernel of a homomorphism (α:G -> H) always contains the identity element of G (which can be derived from the definition of a homomorphism: α(ab) = α(a)α(b)). However, the kernel only contains a single element (the identity of G) when α is a monomorphism (a homomorphism that is injective.) Otherwise, the kernel will have multiple elements.
As for your question, that is correct. The reason is that 2 and 3 are distinct primes, so by unique factorization 2^a 3^b = 1 forces a = b = 0. Thus f is injective and the only element of the kernel is the identity element of ZxZ, (0,0).
If we instead consider a function g: ZxZ --> Q*, where g(a,b) = 2^a 4^b, then the kernel of g has infinitely many elements: all (a,b) that satisfy a = -2b are in the kernel of g. We may readily observe this by substituting a = -2b into g(a,b) = 2^(-2b) 4^b = 4^(-b) 4^b = 1. We see that g is not injective, so the kernel has multiple elements (the set of (a,b) given above).
Sadly, I don't know enough about groups to help with your other questions.
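Both kernels discussed above can be checked by brute force over a small range; Fraction keeps the negative exponents exact. This is just a sanity check, not a proof:

```python
from fractions import Fraction

def f(a, b):
    """f(a, b) = 2^a * 3^b as an exact rational (an element of Q*)."""
    return Fraction(2) ** a * Fraction(3) ** b

def g(a, b):
    """g(a, b) = 2^a * 4^b = 2^(a + 2b) as an exact rational."""
    return Fraction(2) ** a * Fraction(4) ** b

rng = range(-6, 7)
ker_f = [(a, b) for a in rng for b in rng if f(a, b) == 1]
ker_g = [(a, b) for a in rng for b in rng if g(a, b) == 1]

print(ker_f)                               # only (0, 0) maps to the identity 1
print(all(a == -2 * b for a, b in ker_g))  # every kernel element of g has a = -2b
```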
T/F: every infinite cyclic group is isomorphic to Z
This question actually comes up again in a different way. I guessed True and got it, but I don't understand why. Why is Z cyclic, could someone explain? Is it because 1 is considered a generator of the infinite group Z?
Cyclic groups are generated by a single element (one that, along with its inverse, can produce every element of the group). You've found it for the infinite group Z.
But this true/false question is asking a little bit more. Suppose I had another group Z' and all I told you about it was that it was infinite, and that it was cyclic. The question wants us to show that my secret group Z' is actually isomorphic to Z. The standard way to show this is to exhibit a bijective homomorphism f: Z -> Z' (equivalently, a pair of homomorphisms f: Z -> Z' and g: Z' -> Z that are mutually inverse).
Now, tbh I didn't have a great understanding of what a homomorphism was when I took algebra. It's a really general term for "function that preserves structure." If you like computer science, it may help to think of it as a compiler. A homomorphism from A to B lets you simulate all of A in the "language" of B, and it's injective if that simulation never uses the same symbol for 2 different things. A and B are isomorphic if they can simulate each other without ambiguity, so that they're equivalent in power.
Anyway, in this case the homomorphisms are short and sweet: pick a generator g for Z' (we know one exists because it's cyclic), and map it to 1 in Z. If some element is n*g (i.e., g+g+... repeated n times) then map that to n*1 in Z. Likewise for the other direction, map n = n*1 to n*g in Z'.
To show each of these is injective, note that these are both infinite cyclic groups, so if n*g = m*g then n = m (otherwise my secret group Z' would loop around after n-m additions, which would make it finite).
Give an example of an abelian group with 12 elements that is not cyclic
The solution sheet said Z2 x Z4; I said A4.
is A4 also a solution?
Forgive my bad memory, but is A4 abelian? I can't remember if A4 or A5 was the first bad one. A4 does have size 12, so you're good there.
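For what it's worth, a quick brute-force check settles the memory question: A4 does have 12 elements, but it is not abelian (it contains 3-cycles that don't commute), so it doesn't answer the question. A throwaway sketch, representing a permutation p of {0,1,2,3} as a tuple where p[i] is the image of i:

```python
from itertools import permutations

def compose(p, q):
    # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(4))

def sign(p):
    # Count inversions; even permutations have sign +1.
    inv = sum(1 for i in range(4) for j in range(i + 1, 4) if p[i] > p[j])
    return 1 if inv % 2 == 0 else -1

# A4 = the even permutations of 4 elements; it has 12 of them...
a4 = [p for p in permutations(range(4)) if sign(p) == 1]
assert len(a4) == 12

# ...but two 3-cycles need not commute, so A4 is not abelian.
p = (1, 2, 0, 3)  # the 3-cycle (0 1 2)
q = (0, 2, 3, 1)  # the 3-cycle (1 2 3)
assert compose(p, q) != compose(q, p)
```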
Give a reason why Z is not isomorphic to Q
I didn't really know how to answer this. Their answer was that Z is cyclic and Q is not. See the question above I guess.
So going back to our simulation analogy, how do we simulate Z using Q (Q with addition as the operation I'm guessing?) ? Well we can just use the inclusion, send n in Z to n in Q and everything works perfectly.
What about simulating Q with Z? Everything goes wrong pretty fast. If we pick some a/b in Q to send to 1, then we can't fit a/(2b): if our homomorphism is f, then 1 = f(a/b) = f(a/(2b) + a/(2b)) = 2 f(a/(2b)), so f(a/(2b)) needs to equal 1/2... but 1/2 isn't in Z.
There are other problems. For example, Q needs infinitely many generators (the integer multiples of any nonzero fraction will never include half of it; this is basically the problem above), while Z needs only one. This is the observation made in the answer key.
We have a function f: ZxZ ---> Q*, s.t. f(a,b) = 2^a3^b
What is the kernel of f?
The answer here is that the kernel is (0,0) I am still having trouble with the kernel of a homomorphism.
Is it because identity in Q is 1, and we get 1 by doing a=0,b=0?
Thanks for any help
Yet again mentioning the simulation analogy: if it helps, you can think of the kernel as the details a simulation ignores. In this problem, we want to keep track of pairs of integers (Z x Z) with their additive structure (so, for example, (1,1) + (3,-2) = (4,-1)). The problem proposes one way to do this inside of Q*, via (a,b) mapsto f(a,b) = 2^a * 3^b.
To explicitly find the kernel, we just solve f(a,b) = 1 (the identity in the simulating group). Why does this work? Well if some (c,d) is in the kernel (gets mapped to 1) then f( (a,b) + (c,d) ) = f(a,b)*1 = f(a,b). So f always ignores copies of (c,d).
If there isn't anything besides the identity in the kernel, then f is injective: f(x,y) = f(a,b) -> f(x,y)f(a,b)^(-1) = 1 -> f((x,y) - (a,b)) = 1, so (x,y) - (a,b) is in the kernel, but that means (x,y) = (a,b).
Just to make sure that you understand what a kernel is:
The kernel of alpha: G ---> H is simply everything in G that gets mapped to the identity of H.
For example, if I have the homomorphism f: Z ----> Z/3Z with f(z) = z (mod 3):
You can easily check that this is a homomorphism using the definition if you want to. The kernel of this homomorphism is {..., -6, -3, 0, 3, 6, 9, ...}, because these are all the elements of Z that map to 0 in Z/3Z.
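That example is easy to verify numerically; a quick sketch (the window of integers checked is arbitrary):

```python
# f(z) = z mod 3 is a homomorphism from (Z, +) to Z/3Z,
# and its kernel is exactly the multiples of 3.
def f(z: int) -> int:
    return z % 3

window = range(-10, 11)

# Homomorphism property f(x + y) = f(x) + f(y), with addition mod 3 on the right:
assert all(f(x + y) == (f(x) + f(y)) % 3 for x in window for y in window)

# Kernel = everything that maps to the identity 0:
kernel = [z for z in window if f(z) == 0]
assert kernel == [-9, -6, -3, 0, 3, 6, 9]
```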
So I've decided I need to refresh my math knowledge, and I want to. I'm a programmer; I've been working in web development (PHP, C#), but I'm making a switch to A.I./machine learning. I'm studying that plus Python, and working on Project Euler exercises for now. I studied some math at university (Masters degree in IT), but I have never had much application for it at work, and I've never considered myself good at math. So I feel really rusty.
Could anyone recommend some good books/resources? I don't really have more specific requirements for now; I just want another challenge for myself. I could use Google, but I wonder if people from here have recommendations. I'd really appreciate the help.
You're probably gonna want a good bit of stats and optimization for ML. The basics behind these are, at the end of the day, analysis (particularly calculus) and linear algebra. The calculus texts by Michael Spivak or Tom Apostol are good here if you're okay diving into a serious undergrad maths book. Apostol also seems to cover some of the algebra.
alright practicing for my final, going over old midterms
here is a question I had trouble with:
find the last 2 digits of 503^163
My method that I gave up on because it was taking too long to be feasible for a midterm:
[spoiler]
1) Notice that the last two digits here are 03, and the 5 should never matter in regards to the last two digits.
2) Use the repeated-squaring method to find 3^2, 3^4, etc., all the way to 3^128.
3) Take products of those squares such that the exponents sum to 163, and look at the last two digits.
[/spoiler]
1) Notice that 503 is congruent to 3 (mod 100) (pretty much what I said for my step 1).
2) Compute phi(100) = phi(2^2)*phi(5^2) = (4-2)(25-5) = 40.
3) Notice 163 is congruent to 3 (mod 40). // they sure expect us to notice lots of stuff
4) 503^163 congruent to 3^163 congruent to 3^3 congruent to 27 (mod 100).
How would you have solved this question? Also, if we are using the expected method, could anyone give me some intuition on why the second method works? What theorem is this tied to? Is this Euler's?
You should always read "Find the last two digits" as "calculate stuff mod 100".
The described method works because the order of the group of units (Z/100Z)* is phi(100) = 40. This means that the order of any unit mod 100 divides 40, and thus g^40 = 1 (mod 100) for any g coprime to 100 (which 503, and hence 3, is).
You could also simply try to find the order of 3 (mod 100) by computing its powers: 3, 9, 27, 81, 243 = 43, 29, 87, 61, 83, 49, 47, 41, 23, 69, 7, 21, 63, 89, 67, 1,
and thus derive that the order of 3 mod 100 is 20, so you can remove any multiple of 20 from the exponent. Since 163 = 3 (mod 20), this once again leads you to 3^3 as your result. Takes a bit longer, but not that much longer, and requires less stuff you need to realize. This is probably how I would have done it.
Didn't we have a similar question a few pages back?
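None of this is exam-legal, but both routes are easy to sanity-check on a computer with Python's three-argument pow:

```python
from math import gcd

assert pow(503, 163, 100) == 27          # last two digits: 27

# Euler route: gcd(503, 100) = 1 and phi(100) = 40, so 503^40 = 1 (mod 100),
# and only the exponent mod 40 matters: 163 = 3 (mod 40).
assert gcd(503, 100) == 1
assert pow(503, 40, 100) == 1
assert pow(3, 163 % 40, 100) == 27

# Direct route: brute-force the multiplicative order of 3 mod 100.
order = next(k for k in range(1, 41) if pow(3, k, 100) == 1)
assert order == 20 and 163 % 20 == 3     # so 3^163 = 3^3 = 27 (mod 100)
```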
On May 18 2019 02:19 Simberto wrote: You should always read "Find the last two digits" as "calculate stuff mod 100".
I am sure we did. I appreciate everyone's patience with me in this thread, since it's mostly me asking questions here I realize I am just absorbing everyone's knowledge and don't really have an opportunity to contribute much. Maybe now that I am back on ADD medication I'll be able to learn this stuff a bit quicker.
Ok so, the crucial part seems to be that the order of any unit mod 100 divides 40, and thus g^40 = 1 (mod 100) for those g. I am sure this will be on the exam, thank you.
I actually like your method too, though. I remember that being done in this thread, and it's pretty straightforward if I can remember to use the negative modulus
No, it is not. Each of these permutations is its own inverse, so every element has order at most 2, and thus the group is not cyclic.
Is my answer accurate? Would i get full credit for that?
Your answer is correct, but whether you would get full credit depends on how explicitly you have established some of those things in class.
I would go a bit more into detail here: basically, find two different elements of order 2 (180° rotations about the x and y axes, for example), and then explain why there cannot be more than one element of order 2 in a cyclic group. That way, you have an actual proof.
I actually like your method too, though. I remember that being done in this thread, and it's pretty straightforward if I can remember to use the negative modulus
The problem with my method is that it doesn't scale well at some point. If in another situation you have an element of order 800 or so, you would have to do 800 calculations to discover this. It works well for kinda small groups, but at some point it can be really annoying, and other ways to quickly discover the result can be helpful.
I have two finite line segments on a cartesian plane. The line segments are defined by their endpoints ((x1,y1),(x2,y2)) and ((x3,y3),(x4,y4)).
The lines are not exactly the same, but they can intersect, touch, or be completely disconnected from each other.
Here is the question:
Imagine that we form a single polygon from the endpoints of our two lines. It could be a triangle or a trapezoid, depending on whether an endpoint of one line lies on the other line or not. Given an arbitrary point (x5,y5) on my cartesian plane, I want to know if that point lies within the bounds of the polygon.
You need to find a way to describe the polygon, for example by finding some sort of "parametrization". I think this could be done by defining a mapping f: [0,1]^2 -> R^2 where f([0,1]x{0}) is the first line segment and f([0,1]x{1}) is the second line segment. Then the question is equivalent to checking if the point lies in the image of this map.
How is your original polygon even constructed? Is it just p1-> p2 -> p3 -> p4 -> p1? That would result in basically 2 triangles with a common point in the case of the two original lines intersecting. (or in cases where the connection lines intersect)
Or are you magically reordering your input points so it always ends as a single polygon? Then how is this magic reordering defined? Because there are multiple possibilities for a given input of 4 points... And if you magically reorder, then what was the meaning of those 2 initial lines anyway? "Here are two lines, they might be opposing edges or maybe diagonals of your new polygon"
The easiest way to put it is that the constructed polygon is the one with the maximum size, and that the original lines can be included within the bounds of the polygon (for example, if the lines criss-crossed such that they made a shape like a plus sign or a cross, then the polygon would be the diamond around that cross).
On July 07 2019 08:41 travis wrote: The easiest way to put it is that the constructed polygon is the one with the maximum size, and that the original lines can be included within the bounds of the polygon (for example, if the lines criss-crossed such that they made a shape like a plus sign or a cross, then the polygon would be the diamond around that cross).
I would've read the assignment as taking the convex hull of the endpoints (probably mostly because convex combinations are what I usually encounter...). Then it's easy to describe the points that lie inside the set. However, I absolutely agree that the task is not worded in a well-defined way, from what I can tell.
On July 07 2019 08:41 travis wrote: The easiest way to put it is that the constructed polygon is the one with the maximum size, and that the original lines can be included within the bounds of the polygon (for example, if the lines criss-crossed such that they made a shape like a plus sign or a cross, then the polygon would be the diamond around that cross).
That is not guaranteeing a unique solution.
Can you give me an example of why not?
Offhand, imagine the following scenario I whipped up in paint:
The two line segments are perpendicular, and C and D are equidistant from B. Therefore A is equidistant from C and D as well. Do you connect AC and BD, or AD and BC?
Ah, maybe I miscommunicated that all endpoints needed to be along the boundary of the polygon. They don't. In that example, we would construct the triangle A-C,C-D,D-A
Okay, so we are simply talking about the convex hull of the 4 input points?
And let's just never mention those two lines again, because they really also don't seem to matter here.
Okay.... so now that it seems like we finally figured out the task.... for your other problem in the most generic way and compatible with the x/y description you gave originally:
As your shape is convex, you can simply describe the inside of your polygon by 3 or 4 inequalities (one for each side), so you only have to check if your point of question is fulfilling those.
So if we are talking about the most brute-forcish algorithm possible:
1) Loop over all possible unordered pairs of points from your 4 points.
1a) There are no pairs left: you are done, the point is inside the polygon.
1b) There are still pairs left: -> 2)
2) Calculate the line through those two points.
3) Check on which side of that line the other two original points are.
3a) They are on different sides -> go back to 1)
3b) They are on the same side -> 4)
4) Check if your point of question is on the same side:
4a) Yes -> go back to 1)
4b) No -> abort, the point is not inside your polygon.
The cases where a point is on the line in 3) are trivial, in the case of the point being on the line in 4) it really depends on how you define "in the polygon"
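Under the convex-hull reading, the side-check loop above fits in a few lines of Python. The helper names and sample points here are made up, and points lying exactly on an edge are counted as inside (per the caveat about step 4):

```python
from itertools import combinations

def side(p, q, r):
    """Sign of the cross product (q - p) x (r - p): which side of line pq is r on?"""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def inside_hull(points, x):
    """For every pair of input points whose line has all remaining points on
    one side (a supporting line of the hull), check that x is on that same
    side. Points exactly on an edge count as inside."""
    for p, q in combinations(points, 2):
        rest = [r for r in points if r is not p and r is not q]
        sides = [side(p, q, r) for r in rest]
        if all(s >= 0 for s in sides) or all(s <= 0 for s in sides):
            s_ref = next((s for s in sides if s != 0), 0)
            sx = side(p, q, x)
            if s_ref != 0 and sx != 0 and (s_ref > 0) != (sx > 0):
                return False           # x is on the wrong side of a hull edge
    return True

pts = [(0, 0), (4, 0), (0, 4), (4, 4)]   # made-up endpoints of two segments
assert inside_hull(pts, (2, 2))          # centre: inside
assert not inside_hull(pts, (5, 2))      # outside to the right
assert inside_hull(pts, (0, 2))          # on an edge counts as inside here
```

Note that pairs forming a diagonal are skipped automatically, because the remaining points then lie on different sides.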
It turns out that these results are pretty easy to calculate, as 12 = 3 (mod 9) and 3^2 = 9 = 0 (mod 9), so 12^(even number) = 0 (mod 9). Similarly, 12 = 1 (mod 11), so 12^(even number) = 1 (mod 11).
Since 61917ab4224 = 12^10, these congruences must also hold for it. Then try to figure out what you can say about 61917ab4224 (mod 9) and (mod 11), and see if this leads you to a conclusion.
So 61917ab4224 = 0 (mod 9) and 61917ab4224 = 1 (mod 11). You can use the digit sum to deal with the (mod 9) part: 6+1+9+1+7+4+2+2+4 + a + b must be 0 (mod 9), because if 61917ab4224 is divisible by 9 (which it is), then its digit sum must be a multiple of 9. And since 6+1+9+1+7+4+2+2+4 = 36 = 0 (mod 9), this means that a+b = 0 (mod 9). If you could actually prove that ab is a multiple of 12, this would mean ab must be 36 or 72. Otherwise, it could also be 18, 27, 36, 45, 54, 63, 72, 81, 90, 99. But we have greatly reduced the number of possible combinations here.
So let's look at the (mod 11) part now: 61917ab4224 = 1 (mod 11). I actually have no idea what to expect here or how to deal with this, so I will just write down what I do, and we will see if something useful results. We can subtract any multiple of 11 from the number without changing its residue. So 61917ab4224 = 50917ab4224 = 06917ab4224 = 00317ab4224 = 00097ab4224 = 00009ab4224 = 9ab4224 = 9ab2002 = 9ab0902 = 9ab0022 = 9ab0000 = 1 (mod 11).
So 9ab0000 = 1 (mod 11). This directly means that 9ab cannot be divisible by 11 (not that interesting, but it kicks out a few of our possibilities): since 990 = 0 (mod 11), ab cannot be 90. 79, 68, 57, 46, 35, 24, 13 are all not on our list of possible results anyway, so we don't really gain anything else.
I get the feeling that maybe 9ab must also be equal to 1 (mod 11); that would directly lead us to 36 being the result (which a quick check of 12^10 with the calculator shows me is true), but I am not quite certain why that would be the case. Can we just drop zeros at the end of the number? (Clearly not true in general for mod calculation, as 50 (mod 6) = 2 while 5 (mod 6) = 5. Maybe it is true for (mod 11), or for (mod prime)? There might be a theorem for this, but I don't remember. Also not true for mod prime in general, as 50 (mod 7) = 1 while 5 (mod 7) = 5.)
So lets think about this. If a<b, we can remove aa0000 as a multiple of 11 from 9ab0000. This means that 90(b-a)0000 = 1, but this doesn't seem fruitful.
AHA! 10 = -1 (mod 11), so dividing by 10 is the same as dividing by -1 (I am pretty sure I can divide like this mod 11). This means that 9ab0000/10000 = 1/(-1)^4, i.e. 9ab = 1 (mod 11), which is true for 991, 980, 969, 958, 947, 936, 925, 914, 903. And that means that ab is an element of {91, 80, 69, 58, 47, 36, 25, 14, 03}. But we already know from the first part that ab is an element of {18, 27, 36, 45, 54, 63, 72, 81, 90, 99}. The only overlap between the two is 36, so ab must be 36.
(Turns out the above is not quite right as stated: I looked it up, and one cannot generally divide like that in modular arithmetic. However, I can divide if the greatest common divisor of the number I want to divide by (10) and the modulus (11) is 1, which it is.)
I hope this helps with how one can approach such a problem without actually knowing what to do, or remembering a lot of the relevant theorems. There is probably a way simpler solution for the mod 11 part, but since I have no idea what that is, I'd rather do an ugly, sloppy solution than none.
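If you want to double-check the whole derivation by brute force (cheating, but reassuring), both congruences pin down the digits uniquely:

```python
# Which digits a, b make 61917ab4224 satisfy both congruences?
candidates = []
for a in range(10):
    for b in range(10):
        n = int(f"61917{a}{b}4224")
        if n % 9 == 0 and n % 11 == 1:
            candidates.append((a, b))

assert candidates == [(3, 6)]      # "ab" = 36, as derived above
assert 12 ** 10 == 61917364224     # and 12^10 really is that number
```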
On August 13 2019 01:23 Starlightsun wrote: What are some real life examples where data makes a perfect parabola?
- The braking distance of cars follows a quadratic equation, which is useful in the contexts of traffic reports, the DMV, and car accident investigations.
- Professional landscaping and optimizing the positions of sprinkler systems in gardens and lawns for efficient watering use quadratics, since the water's trajectory is always an arc (a parabola).
- Civil engineers and architects frequently design and build structures that require parabolas, such as tunnels, bridges, dams, and buildings.
- Any sort of projectile or projectile motion (trajectory) involves parabolas, such as: throwing darts, firing arrows, shooting bullets, throwing or kicking or hitting balls, launching rockets, firing cannonballs, juggling objects, jumping off ski jumps or bike jumps, horseback riding and jumping, dolphins jumping, riding snowboards and skateboards off half-pipes, etc.
- Parabolic surfaces are used to magnify acoustics in auditoriums and speakers.
- Parabolic surfaces are used in optics to optimize vision and visual displays, such as with camera lenses, lenses of glasses, and contact lenses.
- Parabolic surfaces are used to generate, focus, and transfer heat, light, and energy in heaters, headlights, satellite dishes, and other devices.
- Many roller coaster tracks and other amusement park rides are parabolic for safety and speed reasons.
Thanks DarkPlasmaBall. That's fascinating how many applications there are. I'm learning quadratics in algebra and just wondered what they're used to model.
On August 13 2019 03:09 Starlightsun wrote: Thanks DarkPlasmaBall. That's fascinating how many applications there are. I'm learning quadratics in algebra and just wondered what they're used to model.
No problem! After completing our quadratics unit, I always assign my Algebra 2 students a paper... to write about any application of quadratics that they want, based on their hobbies, interests, experiences, etc.
At least get your syntax right. Your first one is missing an = sign, and either brackets or a multiplication operator, in which case the answer is obvious, as the alternatives aren't interesting.
But more importantly, try solving them yourself and tell us where you get stuck, rather than asking us to do your homework.
Sorry dragontattoo, but this isn't a homework forum and you've mistyped at least one math problem anyway. Please consider asking your math teacher for extra help; I'm sure they would want to know if you're unclear about any of the material you're learning in class. Good luck!
I'd also be willing to help here, if you first make some effort to solve the problems, and formulate more clearly where your problems understanding the subject are.
I am doing a data science project, it's a project of my choosing for my final project.
I am hypothesizing that there is official bias from some or all officials in NBA basketball, in the form of unfairly officiating certain games so that the underdog team is more likely to win. Essentially, referee corruption. A common claim among viewers of the NBA.
I've collected data from the 2012-2018 seasons. I know which refs officiated which games, I know the quantity of fouls called and their types, and which teams they were called on. I have the pre-game moneylines for betting (the odds), so I can see which team was the underdog and how big of an underdog they were. I also have the results of the games.
Now, I can visualize this data. I can find and point out discrepancies.
My question is, what is a good statistically sound method by which to best "prove" this bias? Something like a confidence %. I know this is a vague question, if more information is needed just ask and I will do my best to explain.
Well this sounds interesting, but I have some questions about the general idea first:
-Before you get to the rather complicated analysis of "underdogs are more likely to win because of reason X", have you checked if the underdogs did in fact win more often than they should have? Because that's something you should be able to check with just the odds and results, and no further data. If your assumption of biased refs is right, you should notice a difference here. If not, well then I suppose even a more detailed analysis would be unlikely to show any ref bias.
-Quantity of fouls alone does not seem to be enough: I could imagine a lot of reasons why the better or the worse team might actually commit more fouls, and therefore an unbiased ref would correctly call more fouls on one kind of team. For example, while I'm no expert on basketball, the losing team seems to commit a lot of tactical fouls in the final minutes of the game when they think they still have a chance to win. Assuming the underdogs are more likely to be in a losing position, you might find that the referees are more likely to call fouls on the underdogs. But it's not a sign of bias (against underdogs); it might just be a natural sign of how the game is being played.
-If you "only" consider fouls that were actually called, you might not notice that "potential" bias can also show up as non-calls of actual fouls. It seems to me that ideally you should have data comparing actual fouls/non-fouls to calls/non-calls. Obviously, such data might be impossible to obtain, however.
-Also, no: even if referees are more likely to help the underdog, this does not automatically mean they are corrupt. It's quite common for neutral spectators to side with the underdogs in sports contests, and maybe referees are not immune to this either (on a subconscious level).
Another question: how do you even get the odds of a team winning? Betting odds aren't unbiased either. There are many things that go into sports gambling that have nothing to do with the team's chance of winning. So treating that as your ground truth for building a referee bias detector is questionable as well.
On November 09 2019 19:36 Mafe wrote: Well this sounds interesting, but I have some questions about the general idea first:
-Before you get to the rather complicated analysis of "underdogs are more likely to win because of reason X", have you checked if the underdogs did in fact win more often than they should have? Because that's something you should be able to check with just the odds and results, and no further data. If your assumption of biased refs is right, you should notice a difference here. If not, well then I suppose even a more detailed analysis would be unlikely to show any ref bias.
Good question, I plan on investigating a bunch of stuff like that. And, even this question, "do the underdogs win more than they should", is a tough question to answer. I suppose expected value from betting on the winner every game should be roughly even money over the long term. Does that sound correct?
-Quantity of fouls alone does not seem to be enough: I could imagine a lot of reasons why the better or the worse team might actually commit more fouls, and therefore an unbiased ref would correctly call more fouls on one kind of team. For example, while I'm no expert on basketball, the losing team seems to commit a lot of tactical fouls in the final minutes of the game when they think they still have a chance to win. Assuming the underdogs are more likely to be in a losing position, you might find that the referees are more likely to call fouls on the underdogs. But it's not a sign of bias (against underdogs); it might just be a natural sign of how the game is being played.
Really good point and a big mistake on my part to not even think about that. I don't watch basketball much these days but I've seen enough that this should have been a fairly obvious factor.
It will be a lot of work but I could tag fouls/violations with some kind of extra value that contains information on how far ahead or behind the fouling team is. Also, I do plan on tracking quarter of the game as well - I think 2nd half data or even 4th quarter data may end up being way more important than data from entire games.
Anyways this was a crucial thing to bring up, thanks.
-If you "only" consider fouls that were actually called, you might not notice that "potential" bias can also show up as non-calls of actual fouls. It seems to me that ideally you should have data comparing actual fouls/non-fouls to calls/non-calls. Obviously, such data might be impossible to obtain, however.
Man, I wish such data reliably existed. There is something called "the two minute report", and it does include no-calls. But sadly this data is only available for like 2017 onwards, and the report is only given for games where teams were within 3 points in the last two minutes. And the data is of course only for the last two minutes of the game.
But still, it might be a cool thing to take a look at.
But, I can at least compare actual calls on the winner vs. expected calls on the winner (where "expected" is based on how many calls the loser's opponents normally get, and how many calls the winner normally gets). Winner and loser would refer to the specific teams, not league averages, of course.
-Also, no: even if referees are more likely to help the underdog, this does not automatically mean they are corrupt. It's quite common for neutral spectators to side with the underdogs in sports contests, and maybe referees are not immune to this either (on a subconscious level).
That's a valid point and I certainly was never going to explicitly say that anyone was corrupt (i'd avoid even using the word).
However I was thinking about this, and I think it's important that I don't just look at betting odds for "who will win", but also look at point spreads. It's harder to analyze, but tying foul/violation calls to a team beating a point spread could also be pretty interesting. Again, doesn't prove corruption, but I'm not reaaaally going to say that this is what I am trying to prove.
On November 09 2019 22:40 Acrofales wrote: Another question: how do you even get the odds of a team winning? Betting odds aren't unbiased either. There are many things that go into sports gambling that have nothing to do with the team's chance of winning. So treating that as your ground truth for building a referee bias detector is questionable as well.
I found a good historical odds dataset for NBA, it has the opening and closing moneylines and point spread for 3 different betting websites.
You say betting odds aren't unbiased, but I think they are by far the least biased estimator available. They are certainly better than the published statistical methods I've seen. I was also considering doing neural-network prediction just for fun and comparing it to the betting-odds predictions.
But since you bring this up, it could be useful to measure just how good of a predictor the various types of betting odds are over my dataset and include this information.
Very interesting project idea! The first thing I thought of is a probit model, where you'd use betting odds and whatever other data you have as explanatory variables to study whether the underdog wins or loses. It is a fairly standard way to estimate models with binary outcome variables (such as your win/loss). You could also see whether the possible bias is equally large in different subsamples of your data: is it the same for every team, is there a difference if the betting odds are close or far apart, does one ref or another favour underdogs more? There is always a risk of running into really small subsamples if you start doing this, though. But all in all, I'm going to throw out a wild guess that even if there is some true underlying effect, just studying the whole league at once is going to drown it out in noise; it will probably be more interesting to study whether the bias exists in different subsamples and whether it differs from one subsample to another.
I'll assume that by putting "prove" in quotes you already knew that there isn't a bog-standard way of doing this, and that you are going to have to argue why your chosen methodology is valid and why it is better than some other methods regardless. I know there is a fair bit of literature on football (soccer) betting in the UK; that might be a good starting place to see how betting odds are treated and what kind of statistical models are used when working with them.
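To make the binary-outcome regression idea concrete, here is a minimal sketch on synthetic data. Everything in it is an invented stand-in (the planted effect, the sample size, and using a logit instead of a probit link, which only changes the link function), not a claim about the real NBA data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: one row per game. implied_p = the underdog's win
# probability implied by the moneyline; we plant a pro-underdog effect of
# +0.5 on the log-odds scale.
n = 5000
implied_p = rng.uniform(0.05, 0.45, n)
log_odds = np.log(implied_p / (1 - implied_p))
true_logit = log_odds + 0.5
underdog_won = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

# Fit P(underdog wins) = sigmoid(b0 + b1 * log_odds) by Newton's method.
X = np.column_stack([np.ones(n), log_odds])
w = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ w))
    grad = X.T @ (underdog_won - p)
    hess = (X * (p * (1 - p))[:, None]).T @ X
    w = w + np.linalg.solve(hess, grad)

# If underdogs win more often than the odds alone imply, the intercept b0
# comes out positive; here it should land near the planted 0.5.
b0, b1 = w
assert b0 > 0 and b1 > 0
```

A real analysis would use a stats package (and a probit link, robust standard errors, ref-level subsamples, etc.); this only shows the shape of the model being suggested.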
On November 09 2019 03:25 travis wrote: Alright, could use some advice [statistics]
May I know how you got along with this? Sounds interesting.
John Horton Conway, known to many for the Game of Life but more generally a great contributor to game theory and other mathematical fields, passed away on Saturday due to the coronavirus.
Depends on your goals for the system. Just having "a system" isn't hard: the winner gets one point, the loser loses one. That obviously has a lot of flaws.
So figure out exactly what you want, once you can formulate that clearly, you should be able to figure out how to design a system based on that.
Or read up on other rating systems and try to understand why they are designed the way they are, probably starting with the chess Elo rating.
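As a concrete reference point, the basic Elo update is only a few lines. This is a minimal illustrative sketch (K-factor and scale chosen conventionally), not any federation's exact variant:

```python
# Minimal sketch of an Elo-style update. K controls how fast ratings
# move; 400 is the conventional logistic scale factor.

def expected_score(r_a, r_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, score_a, k=32):
    """Return new ratings; score_a is 1 (win), 0.5 (draw), 0 (loss)."""
    e_a = expected_score(r_a, r_b)
    r_a_new = r_a + k * (score_a - e_a)
    r_b_new = r_b + k * ((1 - score_a) - (1 - e_a))
    return r_a_new, r_b_new

# An upset moves ratings more than an expected result does.
print(update(1400, 1600, 1))  # underdog wins -> large swing
```

The design point worth noticing is that the total rating is conserved and the size of the transfer depends on how surprising the result was, which is exactly the property a naive win/loss point system lacks.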
Any book recommendations for thoroughly self-studying topology?
I don't mind expensive books, and I don't mind dry books. One of my favorite books that tried to teach me something is Stephen Prata's C Primer Plus. I like books because authors put more effort into compiling the material. Maybe some maths profs on TL here want to recommend their own book on the topic? Most important to me are clarity and straightforwardness of expression.
Any book recommendations for self-studying operations research?
On August 04 2020 22:53 naughtDE wrote: Any book recommendations for thoroughly self-studying topology?
From what I've heard, James R Munkres 'Topology' is a great textbook. I haven't read it, but my experience of his writing in 'Analysis on Manifolds' has been wonderful, he's very clear and insightful.
God that was fast. Thanks man, gonna check it out!
That was the textbook for my uni's course in topology. It is a good book, but you need a pretty solid basis, mainly in algebra, before tackling this, as it is definitely not an easy topic, and he assumes quite a lot of prior knowledge.
I first learned from Munkres' book and find it good for general topology, i.e., if you do not have a specific direction in mind (geometry, functional analysis, etc.). But if you would like to go in a specific direction, for example geometry, then I would recommend reading Lee's Introduction to Topological Manifolds. I find that this book gives you a much better understanding of the relationship between topological definitions and geometric intuition, and, like Munkres, it has a lot of exercises to chew through.
Seems like Lee has a nice appendix ^^.
Well, it's mostly a review of definitions and basic results. Just to reiterate: with Munkres you would definitely obtain more comprehensive knowledge of topology. Lee concentrates on the geometric aspects (e.g. he has a nice chapter on the classification of 2-dimensional compact manifolds) but barely touches other typical general-topology material: conditions for when a topological space is metrizable (interesting in itself), the Tychonoff theorem, the Stone-Čech compactification, and Baire theory (all three important in functional analysis).
On August 04 2020 22:53 naughtDE wrote: Any book recommendations for thoroughly self-studying topology?
If the DE in your name means german, then you could check out the book "Allgemeine Topologie" by Rene Bartsch, simply because it is written in an enticing, non-dry way in its prose, introductions and motivations but is still a proper maths book. Rene used to give a Topology lecture here at TU Darmstadt that followed it (for obvious reasons), although I'm not sure if the lecture came before the book or vice versa.
I mainly remember the book because the better students in the lecture enjoyed his lecture and the book so much that quite a bunch of them bought it and proof-read it thoroughly, some accumulating a decent number of corrections that made it into the errata (available on Rene's webpage), and because the small part of the book I read (on uniform spaces) was a great read, at least for me.
There are two ways of teaching/learning topology. There is the direct method of starting with definitions and examples, and there is a geometric approach that treats topology as a generalisation of metric spaces (where there is a concept of distance).
The direct method is more abstract, but quicker. It is probably better for students who prefer set theory or logic. This is the approach taken by Munkres.
The geometric approach is more common because most students find it more intuitive. This approach defines metric spaces first and gives properties, theorems, etc about them. Then the definition of a topological space is given, followed by the same/analogous properties, theorems, etc. For books that take this approach you can try Mendelson's 'Introduction to Topology' or Sutherland's 'Introduction to Metric and Topological Spaces'.
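To make the relationship between the two starting points concrete, here are the standard definitions side by side, sketched in LaTeX:

```latex
% A metric space (X, d) generates open sets from balls:
%   B(x, r) = \{ y \in X : d(x, y) < r \}.
% A topology forgets the distance and keeps only the open sets.
\begin{itemize}
  \item A \emph{topology} on a set $X$ is a family $\tau \subseteq \mathcal{P}(X)$
        with $\emptyset, X \in \tau$ that is closed under arbitrary unions
        and finite intersections; the members of $\tau$ are the open sets.
  \item In a metric space $(X, d)$, a set $U$ is open iff every $x \in U$
        admits some $r > 0$ with $B(x, r) \subseteq U$; these open sets
        form a topology, the one the geometric approach generalises from.
\end{itemize}
```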
There are also many notes freely available online; lots of professors put up lecture notes.
It does not, but German is one of the languages I speak. I tried to read some first-semester material by Siegfried Bosch years ago but basically came running back to English, as German used in a scientific context can become the most obfuscated language. I just had a look inside "Allgemeine Topologie" by Rene Bartsch, and he does put things very clearly; maybe I'll give German another go.
On October 08 2020 21:12 greys wrote: I study math in college right now, it is probably the hardest subject for me
The hardest math course I ever took was MAT 344: Applied Combinatorics. It had two midterms worth 25% each and a final exam worth 50%. The first midterm took place just before the deadline to drop the course without incurring an academic penalty. There were 30 people in the class before the first midterm and 14 remaining after it; more than half the class bailed.
On October 08 2020 21:14 Simberto wrote: Very reasonable to bump a thread which has had no activity for three months with that information. Are you an adbot?
I seem to have exhausted my other options, so I thought maybe I'd ask here. Is there an analytical method that would be suitable for the problem I've described below? I'm looking for a way to correlate one response variable with multiple independent variables.
I'm working on several different materials. Each consists of 3-4 phases, and their phase compositions overlap to some extent. Each phase in a given material has a characteristic volume fraction (F). The phase properties I'd be looking at are elastic modulus (E), hardness (H), and plastic yield strength (Y). They may take somewhat different values between nominally the same phases in different materials. Some phases are inherently brittle and do not deform plastically, which means they have no Y that could be measured. Each material has an overall wear resistance (R). Below is an illustration of what I mean:
material #1: phase A (F, E, H, Y) + phase B (F, E, H, Y) + phase C (F, E, H) -> R
material #2: phase A (F, E, H, Y) + phase C (F, E, H) + phase D (F, E, H) -> R
material #3: phase B (F, E, H, Y) + phase Z (F, E, H, Y) -> R
A phase not being present in a given alloy is equivalent to it having a fraction of zero. F is, in a way, a weighting coefficient. The relationship between phase properties or phase fraction and R is not necessarily linear.
We'd like to be able to predict R based on F, E, H and Y of the constituent phases (if there is indeed a link), as well as determine which variables may be superfluous.
Sounds like a regression problem. If you're explicitly interested in finding your superfluous variables, lasso regression is good, and in general any linear regression model is a good place to start.
If your problem is very non-linear and not easily linearized, you can use a decision tree or forest regression model, but it won't give you a neat formula that looks like the ones you described.
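As a hedged illustration of the lasso point above: on synthetic data where only some features matter, the L1 penalty drives the coefficients of the uninformative features to exactly zero, which is what makes it usable for variable selection (sklearn's `Lasso`; the data here is simulated):

```python
# Hedged sketch of lasso for variable selection on synthetic data.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 6))
# Only features 0 and 2 actually influence this simulated target.
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.standard_normal(200)

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)  # features 1, 3, 4, 5 end up (near) zero
```

The strength of the penalty (`alpha`) trades off sparsity against bias in the surviving coefficients, so in practice it is tuned by cross-validation.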
Thanks for the suggestion. I will have a look.
I considered multiple linear regression at the very start but wasn't sure whether it's suitable for this kind of problem, because the variables are not truly independent here. E.g.: (1) the phase fraction must somehow affect how each phase's properties contribute to the net R value; if the fraction of a phase is 0.05, you'd expect its impact to be low regardless of its properties. (2) H is a function of E and Y, but not in a straightforward way: different combinations of E and Y can give the same H, but H generally increases with increasing E or Y. I would have skipped H altogether if it weren't for the fact that some phases don't have Y at all. (3) E, H, and Y are functions of each phase's chemical composition and crystal structure.
I'm not sure how to treat F. Should it be a variable alongside E, H and Y or rather another coefficient modulating them?
It's hard to say how non-linear the problem is. Studies on simpler materials (single phase, with maybe some fraction of a strengthening phase) revealed some general trends in terms of Y and H/E ratio, but not a clear function. Most likely due to other confounding factors, such as the crystal structure or oxide type.
A couple of approaches here. The first is to just try it and see; this is almost always the first approach, as it is simple. In particular, the tree/forest models deal fairly well with covariant variables, but even a simple linear regression model with L1 and/or L2 regularization can do well. It all depends on the data and the problem.
If these don't work well at all, modelling it is going to be tough and you may need some feature engineering to get more information out of your data. However, you can still use some other tricks to tease out some of the covariance. You can use LDA to transform your data to an orthogonal base, and the underlying coefficients will tell you which variables were particularly informative. PCA does something similar, but doesn't take your dependent variable into account specifically. Either can help you weed out some of the covariance.
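A minimal hedged sketch of the PCA idea, on simulated data with two strongly correlated features and one independent one (illustrative only):

```python
# Hedged sketch: PCA decorrelates covarying features.
# The components are orthogonal, and explained_variance_ratio_
# shows how much of the total spread each one carries.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
f1 = rng.standard_normal(100)
f2 = f1 + 0.1 * rng.standard_normal(100)  # nearly a copy of f1
f3 = rng.standard_normal(100)             # independent
X = np.column_stack([f1, f2, f3])

pca = PCA()
Z = pca.fit_transform(X)
print(pca.explained_variance_ratio_)  # first component carries f1/f2's shared variance
```

The loadings (`pca.components_`) then show which original variables feed each component; LDA works analogously but, as noted above, uses the target variable to pick the directions.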
Thanks a bunch! Many of those terms sound completely foreign to me. I will have to do some reading. :-) I've been exploring the possibility of using dimensional analysis (suggested by one professor), but the problem is so different from the examples shown in the literature, that I got lost.
Could you tell me whether those methods are suitable for problems without large datasets? I have a total of 7 materials (M1-M7), composed of 5 different phases (A, B, X, C, D). Each material consists of 2-3 phases (I decided to ignore some minor phases to make things simpler) and has a net R value. Each of those phases is characterized by 3-4 parameters (F, E, H, and possibly Y). R is a function of the properties of the phases that comprise the material (how do I add lower indices?):
Since the materials don't contain all the phases, the fractions of the non-present phases effectively equal zero, while their other parameters do not exist (which I suppose is different from being equal to zero?).
We only get one set of values for each material, because R and F, E, H, and Y are measured in different experiments, using different samples. We can't measure R twice for a given material, then measure F, E, H, and Y twice, and say this set of F, E, H, and Y corresponds to this R and that set to that R. The number of data points equals the number of materials studied, which is much lower than the number of variables. Is this a problem?
Edit: According to Wiki, lasso does perform variable selection, which is somewhat reassuring. :-)
Feature selection techniques are often used in domains where there are many features and comparatively few samples (or data points). Archetypal cases for the application of feature selection include the analysis of written texts and DNA microarray data, where there are many thousands of features, and a few tens to hundreds of samples.
Basically the more variables you have and the fewer samples, the greater the chance for overfitting, so generally you'll want to use the methods to prevent overfitting more aggressively. Regularization and feature selection are techniques that do exactly that. Dimensionality reduction (with LDA, PCA or a few more complex methods) are another (alternative/complementary) way of doing that.
But you need to just dive in and start experimenting with the various techniques, as there is no theoretical best way to do this (well, I guess in theory, there is, we just don't know what it is, and it is different for every problem).
One word of caution: with so few samples, and if running more experiments is hard and expensive, then be extra careful to set aside a few samples for your test set and *do not ever use them*. You can then use various ways for splitting your train and validation data, and once you think you have a method that works well, then use your test data to make sure that is indeed a good method. If you use your test data too much to guide the rest of the process, you can still be overfitting and will need new different data to test whether your method actually works!
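A sketch of that workflow with deliberately few (simulated) samples: hold out a test set once, then use leave-one-out cross-validation on the remainder for model selection. The model and `alpha` here are purely illustrative:

```python
# Hedged sketch: strict test-set holdout plus leave-one-out CV,
# the setup recommended above when samples are scarce.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import LeaveOneOut, train_test_split

rng = np.random.default_rng(3)
X = rng.standard_normal((20, 4))
y = X @ np.array([1.0, 0.0, -2.0, 0.0]) + 0.1 * rng.standard_normal(20)

# Set the test rows aside and do not touch them during model selection.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=5, random_state=0)

errors = []
for train_idx, val_idx in LeaveOneOut().split(X_tr):
    model = Lasso(alpha=0.05).fit(X_tr[train_idx], y_tr[train_idx])
    errors.append((model.predict(X_tr[val_idx])[0] - y_tr[val_idx][0]) ** 2)
print(np.mean(errors))  # validation error guides model choice

# Only after settling on a final method: evaluate once on X_te / y_te.
```

With leave-one-out, every non-test sample gets used for validation exactly once, which squeezes the most out of a tiny dataset while keeping the final test estimate honest.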
I've been learning about those methods from StatQuest. Your recommendations seem well-suited for the problem I'm working on. Very solid advice. You've been really helpful. :-)
I'm not sure if it's possible to solve it this way, but it's my best bet. Even if it doesn't work out, I'll still have plenty of data to analyze qualitatively.
Thanks for the warning. I'm aware of that. Having this few data points will make splitting my data tricky. Unfortunately, adding more data points would require making new materials and testing them. That would probably cost several thousand pounds per data point. :-P